Abstract

While in many graph mining applications it is crucial to handle a stream of updates efficiently
in terms of both time and space, little was known about how to design algorithms that achieve both.
In this paper we study this issue for the densest subgraph problem, which lies at the core of many
graph mining applications. We develop an algorithm that achieves time- and space-efficiency for
this problem simultaneously. To the best of our knowledge, it is one of the first of its kind for
graph problems.

Given an input graph $G = (V, E)$, the “density” of a subgraph induced by a subset of nodes
$S \subseteq V$ is defined as $|E(S)|/|S|$, where $E(S)$ denotes the set of edges in $E$ with both endpoints
in $S$. In the densest subgraph problem, the goal is to find a subset of nodes that maximizes the
density of the corresponding induced subgraph.

For any $\epsilon > 0$, we present a dynamic algorithm that, with high probability, maintains a
$(4+\epsilon)$-approximate solution for the densest subgraph problem under a sequence of edge insertions and
deletions in an input graph with $n$ nodes. The algorithm uses $\tilde{O}(n)$ space, and has an amortized
update time of $\tilde{O}(1)$ and a query time of $\tilde{O}(1)$. Here, $\tilde{O}$ hides a $O(\mathrm{poly}\log_{1+\epsilon} n)$ term. The
approximation ratio can be improved to $(2+\epsilon)$ at the cost of increasing the query time to $\tilde{O}(n)$.
It can be extended to a $(2+\epsilon)$-approximation sublinear-time algorithm and a distributed-streaming
algorithm. Our algorithm is the first streaming algorithm that can maintain the
densest subgraph in one pass. Prior to this, no algorithm could do so even in the special
case of an incremental stream and even when there is no time restriction. The previously best
algorithm in this setting required $O(\log n)$ passes [Bahmani, Kumar and Vassilvitskii, VLDB’12].
The space required by our algorithm is tight up to a polylogarithmic factor.
1 Introduction


In analyzing large-scale rapidly-changing graphs, it is crucial that algorithms use little space
and adapt to changes quickly. This is the main subject of interest in at least two areas, namely
data streams and dynamic algorithms. In the context of graph problems, both areas are interested 
in maintaining some graph property, such as connectivity or distances, for graphs undergoing a 
stream of edge insertions and deletions. This is known as the (one-pass) dynamic semi-streaming 
model in the data streams community, and as the fully-dynamic model in the dynamic algorithm 
community. 

The two areas have been actively studied since at least the early 80s (e.g. [17, 35]) and have 
produced several sophisticated techniques for achieving time and space efficiency. In dynamic 
algorithms, where the primary concern is time, the heavy use of amortized analysis has led to 
several extremely fast algorithms that can process updates and answer queries in poly-logarithmic 
amortized time. In data streams, where the primary concern is space, the heavy use of sampling 
techniques to maintain small sketches has led to algorithms that require space significantly less than 
the input size; in particular, for dynamic graph streams the result by Ahn, Guha, and McGregor 
[1] has demonstrated the power of linear graph sketches in the dynamic model, and initiated an
extensive study of dynamic graph streams (e.g. [1-3, 26, 27]). Despite numerous successes in these 
two areas, we are not aware of many results that combine techniques from both areas to achieve 
time- and space-efficiency simultaneously in dynamic graph streams. A notable exception we are 
aware of is the connectivity problem, where one can combine the space-efficient streaming algorithm 
of Ahn et al. [2] with the fully-dynamic algorithm of Kapron et al. [28] (footnote 1).

1.1 Problem definition 

In this paper, we study the densest subgraph problem in the dynamic and streaming settings. Fix any
unweighted undirected input graph $G = (V, E)$. The density of a subgraph induced by the set of
nodes $H \subseteq V$ is defined as $\rho(H) = |E(H)|/|H|$, where $E(H) = \{(u,v) \in E : u, v \in H\}$ is the set of
edges in the induced subgraph. The densest subgraph of $G$ is the subgraph induced by a node set
$H \subseteq V$ that maximizes $\rho(H)$, and we denote the density of such a subgraph by $\rho^*(G) = \max_{H \subseteq V} \rho(H)$.

For any $\gamma \geq 1$ and $\eta$, we say that $\eta$ is a $\gamma$-approximate value of $\rho^*(G)$ if $\rho^*(G)/\gamma \leq \eta \leq \rho^*(G)$.
The (static) densest subgraph problem is to compute or approximate $\rho^*(G)$ and the corresponding
subgraph. Throughout the paper, we use $n = |V|$ and $m = |E|$ to denote the number of nodes and
edges in the input graph, respectively.
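The definition of density can be made concrete with a small sketch. The helper names below (`density`, `densest_subgraph_bruteforce`) and the example graph are illustrative assumptions and do not appear in the paper; the exhaustive search is exponential and only meant for tiny graphs.

```python
from itertools import combinations

def density(edges, nodes):
    """Return |E(H)| / |H| for the subgraph induced by the node set `nodes`."""
    nodes = set(nodes)
    if not nodes:
        raise ValueError("density is undefined for the empty set")
    induced = sum(1 for u, v in edges if u in nodes and v in nodes)
    return induced / len(nodes)

def densest_subgraph_bruteforce(vertices, edges):
    """Try every nonempty subset of nodes and keep one of maximum density."""
    vertices = list(vertices)
    best_set, best_val = None, -1.0
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            val = density(edges, subset)
            if val > best_val:
                best_set, best_val = set(subset), val
    return best_set, best_val

# Hypothetical example: a 4-clique on {0, 1, 2, 3} plus a pendant node 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
best_set, best_val = densest_subgraph_bruteforce(range(5), edges)
```

Here the whole graph has density 7/5, while the 4-clique has density 6/4, so the clique is the densest subgraph of this example.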

This problem and its variants have been intensively studied in practical areas, as it is an important
primitive in analyzing massive graphs. Its applications include identifying dense communities
in social networks (e.g. [13]), link spam detection (e.g. [18]), and finding stories and events (e.g.
[4]); for many more applications of this problem see, e.g., [6, 31, 42, 43]. Goldberg [20] was one of
the first to study this problem, although the notion of graph density has been around much earlier
(e.g. [30, Chapter 4]). His algorithm solves this problem in polynomial time by using $O(\log n)$
flow computations. Later, Gallo, Grigoriadis and Tarjan slightly improved the running time using
parametric maximum flow computation. These algorithms are, however, not very practical, and an
algorithm that is more popular in practice is the $O(m)$-time $O(m)$-space 2-approximation algorithm
of Charikar [9]. However, as mentioned earlier, graphs arising in modern applications are huge
and keep changing, and the earlier algorithms cannot handle edge insertions/deletions in the input
graph. Consider, for example, an application of detecting a dense community in social networks.

Footnote 1: We thank Valerie King (private communication) for pointing out this fact.
Since people can make new friends as well as “unfriend” their old friends, the algorithm must be
able to process these updates efficiently. With this motivation, it is natural to consider a dynamic 
version of this problem as defined below. 

Our Model. We start with an empty graph $G = (V, E)$ where $E = \emptyset$. Subsequently, at each
time-step, an adversary either inserts an edge into the graph, or deletes an already existing edge
from the graph. The set of nodes, on the other hand, remains unchanged. The goal is to maintain
a good approximation to the value of the densest subgraph while processing this sequence of edge
insertions/deletions. More formally, we want to design a data structure for the input graph
$G = (V, E)$ that supports the following operations.

• Initialize(V): Initialize the data structure with an empty graph $G = (V, E)$ where $E = \emptyset$.

• Insert(u, v): Insert the edge $(u,v)$, where $u, v \in V$, into the graph $G$.

• Delete(u, v): Delete an existing edge $(u,v) \in E$ from the graph $G$.

• QueryValue: Return an estimate of the value of the maximum density $\rho^*(G) = \max_{S \subseteq V} \rho(S)$.
If this estimate is always within a $\gamma$-factor of $\rho^*(G)$, for some $\gamma \geq 1$, then we say that the
algorithm maintains a $\gamma$-approximation to the value of the densest subgraph. We want this
approximation factor to be a small constant.
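To make the interface concrete, the following is a naive baseline, not the data structure developed in this paper. It stores every edge (so it uses space linear in the number of edges rather than near-linear in the number of nodes) and answers each query by rerunning greedy min-degree peeling in the style of Charikar's 2-approximation [9], which is slow. The class and method names are assumptions chosen to mirror the four operations above.

```python
class NaiveDensestSubgraph:
    """Baseline for Initialize / Insert / Delete / QueryValue (illustrative only)."""

    def __init__(self, vertices):
        # Initialize(V): empty graph on a fixed node set.
        self.adj = {v: set() for v in vertices}

    def insert(self, u, v):
        # Insert(u, v): add an undirected edge between distinct nodes u and v.
        self.adj[u].add(v)
        self.adj[v].add(u)

    def delete(self, u, v):
        # Delete(u, v): remove an existing edge.
        self.adj[u].discard(v)
        self.adj[v].discard(u)

    def query_value(self):
        # QueryValue: peel a minimum-degree node repeatedly and return the
        # best density seen among the remaining node sets.
        adj = {v: set(nb) for v, nb in self.adj.items()}
        m = sum(len(nb) for nb in adj.values()) // 2
        best = 0.0
        while adj:
            best = max(best, m / len(adj))
            v = min(adj, key=lambda x: len(adj[x]))
            for u in adj[v]:
                adj[u].discard(v)
            m -= len(adj[v])
            del adj[v]
        return best

# Hypothetical usage: a 4-clique plus a pendant node, then one deletion.
ds = NaiveDensestSubgraph(range(5))
for e in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]:
    ds.insert(*e)
value_before = ds.query_value()
ds.delete(0, 1)
value_after = ds.query_value()
```

The value returned by peeling is the density of an actual node set, so it never exceeds $\rho^*(G)$; the point of the paper is to obtain comparable guarantees with far less space and time per operation.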

The performance of a data structure is measured in terms of four different metrics, as defined below.

• Space-complexity: This is given by the total space (in terms of bits) used by the data structure. 

• Update-time: This is the time taken to handle an Insert or Delete operation. 

• Query-time: This is the time taken to handle a QueryValue operation. 

• Preprocessing-time: This is the time taken to handle the Initialize operation. Unless
explicitly mentioned otherwise, in this paper the preprocessing time will always be O(n).

Comparison with the semi-streaming model. In the streaming algorithms literature, the
“semi-streaming model” for graph problems is defined as follows. We start with an empty graph of
$n$ nodes. Subsequently, we have to process a “stream” of updates in the graph. For “insert-only”
streams, each update consists of inserting a new edge into the graph. For “dynamic” (or “turnstile”)
streams, each update consists of either inserting a new edge into the graph or deleting an already
existing edge from the graph.

A “semi-streaming algorithm” can use only $\tilde{O}(n)$ bits of space while processing a stream of
updates. In particular, the algorithm cannot store all the edges in the graph (which might require
$\Omega(n^2)$ space). At the end of the stream, the algorithm has to output an (approximate) solution
to the problem concerned, which, in our case, happens to be the value of the densest subgraph.
The algorithm is allowed to make “multiple passes” over this stream. Typically, in the streaming
algorithms literature the focus is on the space complexity, and optimizing the update-time and the
query-time (which can be as large as $\Omega(n)$) is of secondary importance.

Our goal. We want to design algorithms that maintain constant factor approximations to the 
value of the densest subgraph in a dynamic setting, have very fast (polylogarithmic in n) update 
and query times, and use very little (near-linear in n) space. In other words, we want single-pass 
semi-streaming algorithms over dynamic streams with polylogarithmic update and query times. 
Remark on the query operation. The QueryValue operation described above asks only for
an estimate of the value $\rho^*(G)$. This raises a natural question: Can we answer a more general
query that asks for a subset of nodes which constitute an approximate densest subgraph (in time 
proportional to the number of nodes returned in response to the query)? The answer is yes. We 
can easily extend all the algorithms presented in this paper so as to enable them with this new 
feature (see the discussion after the proof of Corollary 2.7). 

1.2 Our Results 

Our main result is an efficient $(4+\epsilon)$-approximation algorithm for this problem (see Theorem 1.1).
To be more specific, we present a randomized algorithm that can process a stream of polynomially
many edge insertions/deletions starting from an empty graph using only $\tilde{O}(n)$ space, and with high
probability, the algorithm maintains a $(4+\epsilon)$-approximation to the value of the densest subgraph.
The algorithm has $\tilde{O}(1)$ amortized update-time and $\tilde{O}(1)$ query-time.

For every integer $t \geq 0$, let $G^{(t)} = (V, E^{(t)})$ be the state of the input graph $G = (V, E)$ just
after we have processed the first $t$ updates (edge insertions/deletions) in the dynamic stream, and
define $m^{(t)} \leftarrow |E^{(t)}|$. Thus, we have $m^{(0)} = 0$ and $m^{(t)} \geq 0$ for all $t \geq 1$. We let $\mathrm{Opt}^{(t)} = \rho^*(G^{(t)})$
denote the density of the densest subgraph in $G^{(t)}$.

Notation. Throughout this paper, the notations $\tilde{O}(\cdot)$ and $\tilde{\Theta}(\cdot)$ will hide $\mathrm{poly}(\log n, 1/\epsilon)$ factors
in the running times and space complexities of our algorithms, where $\epsilon \in (0,1)$ is a small constant.

Theorem 1.1. Fix a small constant $\epsilon \in (0,1)$, a constant $\lambda > 1$, and let $T = \lceil n^{\lambda} \rceil$. We can
process the first $T$ updates (edge insertions/deletions) in a dynamic stream using $\tilde{O}(n)$ space, and
maintain a value $\mathrm{Output}^{(t)}$ at each $t \in [T]$. The algorithm gives the following guarantees with
high probability: We have $\mathrm{Opt}^{(t)}/(4 + O(\epsilon)) \leq \mathrm{Output}^{(t)} \leq \mathrm{Opt}^{(t)}$ for all $t \in [T]$. Further, the
total amount of computation performed while processing the first $T$ updates in the stream is $\tilde{O}(T)$.

Oblivious Adversary. We remark that Theorem 1.1 holds only when the sequence of edge 
insertions/deletions in the input graph does not depend on the random bits used by our algorithm. 
In other words, the “adversary”, who decides upon the sequence of edge insertions/deletions, is 
“oblivious” to the random bits used in the algorithm. This is a standard assumption in the graph 
streaming literature. For example, the paper by Ahn, Guha and McGregor [1] also requires this 
assumption on the adversary. We prove Theorem 1.1 in Section 5. In addition, we obtain the 
following results. 

• A $(2+\epsilon)$-approximation one-pass dynamic semi-streaming algorithm: This follows
from the fact that with the same space, preprocessing time, and update time, and an additional
$\tilde{O}(n)$ query time, our main algorithm can output a $(2+\epsilon)$-approximate solution. See Section 3.

• A $(4+\epsilon)$-approximation deterministic dynamic algorithm with $\tilde{O}(1)$ update time. In
Section 4, we present a deterministic algorithm that maintains a $(4+\epsilon)$-approximation to the value
of the densest subgraph. This requires $O(m + n)$ space, and $\tilde{O}(1)$ update and query times.

• Extensions to directed graphs. In Section 6, we extend our result from Section 4 to directed
graphs. Specifically, we present a deterministic dynamic algorithm that maintains an
$(8+\epsilon)$-approximation to the value of the densest subgraph of a directed graph. This requires $O(m + n)$
space, and $\tilde{O}(1)$ update and query times.
• Sublinear-time algorithm: We show that Charikar’s linear-time linear-space algorithm [9]
can be improved further. In particular, if the graph is represented by an incidence list (this is a
standard representation [10, 19]), our algorithm needs to read only $\tilde{O}(n)$ edges in the graph (even if
the graph is dense) and requires $\tilde{O}(n)$ time to output a $(2+\epsilon)$-approximate solution. We also provide
a lower bound that matches this running time up to a poly-logarithmic factor. See Section 7.

• Distributed streaming algorithm: In the distributed streaming setting with $k$ sites as defined
in [11], we can compute a $(2+\epsilon)$-approximate solution with $\tilde{O}(k + n)$ communication by
employing the algorithm of Cormode et al. [11]. See Section 8.

1.3 Previous work 

To the best of our knowledge, our main algorithm is the first dynamic graph algorithm that requires
$\tilde{O}(n)$ space and at the same time can quickly process each update and answer each query for
densest subgraph. Previously, there was no space-efficient algorithm known for this problem, even
when time efficiency is not a concern, and even for insert-only streams. In this insert-only model,
Bahmani, Kumar, and Vassilvitskii [6] provided a deterministic $(2+\epsilon)$-approximation $\tilde{O}(n)$-space
algorithm. Their algorithm needs $O(\log_{1+\epsilon} n)$ passes; i.e., it has to read through the sequence of
edge insertions $O(\log_{1+\epsilon} n)$ times. (Their algorithm was also extended to a MapReduce algorithm,
which was later improved by [5].) In Section 3, we improve this result of Bahmani et al. in two
respects: (a) We can process a dynamic stream of updates, and (b) we need only a single pass.
Further, the space usage of our algorithm from Section 3 matches the lower bound provided by [6,
Lemma 7] up to a polylogarithmic factor.

We note that in some settings it is reasonable to compute the solution at the end of the
stream or even make multiple passes (e.g. when the graph is kept in external memory), and
thus our and Bahmani et al.'s $(2+\epsilon)$-approximation algorithms are sufficient in these settings. However, there
are many natural settings where the stream keeps changing, e.g. social networks where users keep
making new friends and disconnecting from old friends. In the latter case our main algorithm is
necessary since it can quickly prepare to answer the densest subgraph query after every update.

Another related result in the streaming setting is by Ahn et al. [2] which approximates the 
fraction of some dense subgraphs such as a small clique in dynamic streams. This algorithm does 
not solve the densest subgraph problem but might be useful for similar applications. 

Not much was known about time-efficient algorithms for this problem even when space efficiency
is not a concern. One possibility is to adapt dynamic algorithms for the related problem called
dynamic arboricity. The arboricity of a graph $G$ is $\alpha(G) = \max_{U \subseteq V(G)} |E(U)|/(|U| - 1)$, where $E(U)$
is the set of edges of $G$ that belong to the subgraph induced by $U$. Observe that $\rho^*(G) < \alpha(G) \leq 2\rho^*(G)$.
Thus, a $\gamma$-approximation algorithm for the arboricity problem will be a $(2\gamma)$-approximation
algorithm for densest subgraph. In particular, we can use the 4-approximation algorithm of Brodal
and Fagerberg [7] to maintain an 8-approximate solution to the densest subgraph problem in $\tilde{O}(1)$
amortized update time. (With a little more thought, one can in fact improve the approximation
ratio to 6.)

In a work that appeared at about the same time as the preliminary version of this paper,
Epasto et al. [14] presented a $(2+\epsilon)$-approximation algorithm for densest subgraph which can
handle arbitrary edge insertions and random edge deletions. After the preliminary version of our
paper appeared, Esfandiari et al. [16] and McGregor et al. [33] presented semi-streaming algorithms
for densest subgraph that give a $(1+\epsilon)$-approximation and require $\tilde{O}(n)$ space. The same result was
obtained independently by Mitzenmacher et al. [34]. These improve the approximation ratio of our
$(2+\epsilon)$-approximation semi-streaming algorithm. Like our $(2+\epsilon)$-approximation algorithm, their
algorithms have an update-time of $\tilde{O}(1)$, but the query-time can be as large as $\Omega(n)$.
1.4 Overview of our techniques

An intuitive way to combine techniques from data streams and dynamic algorithms for any problem 
is to run the dynamic algorithm using the sketch produced by the streaming algorithm as an input. 
This idea does not work straightforwardly. The first obvious issue is that the streaming algorithm
might take an excessively long time to maintain its sketch and the dynamic algorithm might require
an excessively large amount of additional space. A more subtle issue is that the sketch might need to be
processed in a specific way to recover a solution, and the dynamic algorithm might not be able 
to facilitate this. As an extreme example, imagine that the sketch for our problem is not even a 
graph; in this case, we cannot even feed this sketch to a dynamic algorithm as an input. 

The key idea that allows us to get around this difficulty is to develop streaming and dynamic
algorithms based on the same structure, called an $(\alpha, d, L)$-decomposition. This structure is an
extension of a concept called the $d$-core, which has been studied in graph theory since at least the 60s (e.g.,
[15, 32, 41]) and has played an important role in the studies of the densest subgraph problem (e.g.,
[6, 40]). The $d$-core of a graph is its (unique) largest induced subgraph with every node having
degree at least $d$. It can be computed by repeatedly removing nodes of degree less than $d$ from
the graph, and can be used to 2-approximate the densest subgraph. Our $(\alpha, d, L)$-decomposition
with parameter $\alpha \geq 1$ is an approximate version of this process where we repeatedly remove nodes
of degree “approximately” less than $d$: in this decomposition we must remove all nodes of degree
less than $d$ and are allowed to remove some nodes of degree between $d$ and $\alpha d$. We will repeat this
process for $L$ iterations. Note that the $(\alpha, d, L)$-decomposition of a graph is not unique. However,
for $L = O(\log_{1+\epsilon} n)$, an $(\alpha, d, L)$-decomposition can be used to $2\alpha(1+\epsilon)^2$-approximate the densest
subgraph. We explain this concept in detail in Section 2.2.
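The peeling procedure for the $d$-core described above can be sketched as follows. The function name `d_core` and the adjacency-set representation are assumptions made for illustration; this is the exact (non-approximate) peeling process, not the $(\alpha, d, L)$-decomposition itself.

```python
def d_core(adj, d):
    """Return the node set of the d-core of an undirected graph.

    `adj` maps each node to the set of its neighbors. Nodes of degree
    less than d are removed repeatedly until none remain.
    """
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    stack = [v for v in alive if deg[v] < d]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # v was already removed via an earlier stack entry
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                if deg[u] < d:
                    stack.append(u)
    return alive

# Hypothetical example: a 4-clique on {0, 1, 2, 3} plus a pendant node 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
adj = {v: set() for v in range(5)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

core3 = d_core(adj, 3)
core4 = d_core(adj, 4)
```

In this example the 3-core is the clique and the 4-core is empty, since removing the pendant node leaves every remaining node with degree 3.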

We show that this concept can be used to obtain an approximate solution to the densest
subgraph problem and leads to both a streaming algorithm with a small sketch and a dynamic
algorithm with small amortized update time. In particular, it is intuitive that to check if a node
has degree approximately $d$, it suffices to sample every edge with probability roughly $1/d$. The
value of $d$ that we are interested in is approximately $\rho^*(G)$, which can be shown to be roughly the
same as the average degree of the graph. Using this fact, it follows almost immediately that we
only have to sample $\tilde{O}(n)$ edges. Thus, to repeatedly remove nodes for $L$ iterations, we will need
to sample $\tilde{O}(Ln) = \tilde{O}(n)$ edges (we need to sample a new set of edges in every iteration to avoid
dependencies).
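The sampling intuition can be illustrated with a short sketch. It assumes plain independent edge sampling with probability $p$ and rescales sampled degrees by $1/p$; the function names are hypothetical, and the sketch maintained by the paper's streaming algorithm (which must also support deletions in small space) is not reproduced here.

```python
import random

def sample_edges(edges, p, rng):
    """Keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]

def estimated_degrees(vertices, sampled_edges, p):
    """Estimate every degree as (degree in the sample) / p."""
    count = {v: 0 for v in vertices}
    for u, v in sampled_edges:
        count[u] += 1
        count[v] += 1
    return {v: c / p for v, c in count.items()}

# Hypothetical example: a 4-clique on {0, 1, 2, 3} plus a pendant node 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
rng = random.Random(0)

# With p = 1 every edge is kept, so the estimates equal the true degrees.
exact = estimated_degrees(range(5), sample_edges(edges, 1.0, rng), 1.0)

# With a smaller p the estimates are unbiased but noisy.
noisy = estimated_degrees(range(5), sample_edges(edges, 0.5, rng), 0.5)
```

Drawing a fresh sample for each of the $L$ iterations, as the text notes, keeps the degree estimates used in one iteration independent of the removals decided in earlier iterations.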

We turn the $(\alpha, d, L)$-decomposition concept into a dynamic algorithm by dynamically
maintaining the sets of nodes removed in each of the $L$ iterations, called levels. Since the
$(\alpha, d, L)$-decomposition gives us a choice whether to keep or remove each node of degree between $d$ and
$\alpha d$, we can save time needed to maintain this decomposition by moving nodes between levels only
when it is necessary. If we allow $\alpha$ to be large enough, nodes will not be moved often and we can
obtain a small amortized update time; in particular, it can be shown that the amortized update
time is $\tilde{O}(1)$ if $\alpha > 2 + \epsilon$. In analyzing an amortized time, it is usually tricky to come up with the
right potential function that can keep track of the cost of moving nodes between levels, which is
not frequent but expensive. In the case of our algorithm, we have to define two potential functions for
our amortized analysis, one on nodes and one on edges. (For intuition, we provide an analysis for
the simpler case where we run this dynamic algorithm directly on the input graph in Section 4.)

Our goal is to run the dynamic algorithm on top of the sketch maintained by our streaming
algorithm in order to maintain the $(\alpha, d, L)$-decomposition. To do this, there are a few issues we
have to deal with that make the analysis rather complicated: In the sketch we maintain $L$ sets of
sampled edges, and for each of the $L$ iterations we use a different such set to determine which nodes
to remove. This causes the potential functions and their analysis to be even more complicated since
whether a node should be moved from one level to another depends on its degree in one set, but
the cost of moving such a node depends on its degree in other sets as well. The analysis, however,
goes through (intuitively because all sets are sampled from the same graph and so their degree
distributions are close enough). See Section 5 for further details.

1.5 Roadmap 

The rest of the paper is organized as follows. 

• We define the preliminary concepts and notations in Section 2. 

• In Section 3, we present an algorithm that returns a $(2+\epsilon)$-approximation to the value of the
densest subgraph. The algorithm processes a stream of edge insertions/deletions using only
$\tilde{O}(n)$ bits of space, and at the end of the stream returns an estimate of $\rho^*(G)$ in $\tilde{O}(n)$ time.
The output of the algorithm is correct with high probability.

• In Section 4, we present a deterministic algorithm that maintains a $(4+\epsilon)$-approximation to
the value of the densest subgraph in $O(m + n)$ space. It has $\tilde{O}(1)$ update and query times.

• We present our main result in Section 5. Specifically, combining the techniques from Sections 3
and 4, we design an algorithm that maintains a $(4+\epsilon)$-approximation to the value of the
densest subgraph with high probability, and requires only $\tilde{O}(n)$ space and $\tilde{O}(1)$ update time.

• In Section 6, we extend the result from Section 4 to directed graphs. Specifically, in a directed
graph, we present a deterministic algorithm that maintains an $(8+\epsilon)$-approximation to the
value of the densest subgraph using $O(m + n)$ space. It has $\tilde{O}(1)$ update and query times.

• In Sections 7 and 8 we present simple extensions of our result from Section 3, giving
sublinear-time and distributed-streaming algorithms for densest subgraph.


2 Notations and Preliminaries 

We start by defining some notations that will be used throughout the rest of the paper. We
denote the input graph by $G = (V, E)$. It has $n = |V|$ nodes and $m = |E|$ edges. Let
$\mathcal{N}_v = \{u \in V : (u,v) \in E\}$ and $D_v = |\mathcal{N}_v|$ respectively denote the set of neighbors and the degree of a
node $v \in V$. Consider any subset of nodes $S \subseteq V$. Let $E(S) = \{(u,v) \in E : u, v \in S\}$ denote
the set of edges with both endpoints in $S$, and let $G(S) = (S, E(S))$ denote the subgraph of $G$
induced by the nodes in $S$. Further, given any subset of edges $E' \subseteq E$ and any node $u \in V$,
define $\mathcal{N}_u(S, E') = \{v \in \mathcal{N}_u \cap S : (u,v) \in E'\}$ and $D_u(S, E') = |\mathcal{N}_u(S, E')|$. In other words,
$\mathcal{N}_u(S, E')$ is the subset of nodes in $S$ that are neighbors of $u$ in the subgraph induced by the edges
in $E'$, whereas $D_u(S, E')$ denotes the degree of $u$ among the nodes in $S$ in the same subgraph.
For simplicity, we write $\mathcal{N}_u(S)$ and $D_u(S)$ instead of $\mathcal{N}_u(S, E)$ and $D_u(S, E)$. If the set of nodes
$S \subseteq V$ is nonempty, then its density and average-degree are defined as $\rho(S) = |E(S)|/|S|$ and
$\delta(S) = \sum_{v \in S} D_v(S)/|S|$ respectively. Throughout the paper, the symbol $\tilde{O}(\cdot)$ will be used to hide
$\mathrm{poly}(\log n, 1/\epsilon)$ factors in the running times of our algorithms, where $\epsilon > 0$ is some arbitrary small
constant (the approximation guarantee will depend on $\epsilon$). Finally, for any positive integer $k$, we
will use the symbol $[k]$ to denote the set $\{1, \ldots, k\}$.

This paper deals with the “Densest Subgraph Problem”, which is about finding a subset of
nodes of maximum density. Specifically, we want to find a subset of nodes $S \subseteq V$ in the input
graph $G = (V, E)$ that maximizes $\rho(S)$. Further, a subset $S \subseteq V$ is called a $\gamma$-approximate densest
subgraph, for $\gamma \geq 1$, iff $\gamma \cdot \rho(S) \geq \max_{S' \subseteq V} \rho(S')$. We will consider the densest subgraph problem
in a dynamic setting. For a detailed description of our model, see Section 1.1. The main result of
this paper is summarized below.

Theorem 2.1. There is a dynamic data structure for the densest subgraph problem that requires
$\tilde{O}(n)$ bits of space, has an amortized update time of $\tilde{O}(1)$, a query time of $\tilde{O}(1)$, and with high
probability maintains a $(4+\epsilon)$-approximation to the value of the densest subgraph.

It follows that Theorem 2.1 gives a single-pass semi-streaming algorithm over dynamic streams 
for the approximate densest subgraph problem. And, unlike most other semi-streaming algorithms, 
Theorem 2.1 gives very fast update and query times. 

2.1 Three basic properties 

We now state three basic lemmas that will be used throughout the rest of the paper. The first 
lemma shows that the average degree of a set of nodes is twice its density. 

Lemma 2.2. For all nonempty $S \subseteq V$, we have $\delta(S) = 2 \cdot \rho(S)$.

Proof. We have $\delta(S) = \sum_{v \in S} D_v(S)/|S| = 2 \cdot |E(S)|/|S| = 2 \cdot \rho(S)$. The second equality holds
since every edge is incident upon two nodes. □

The second lemma gives simple upper and lower bounds on the maximum density of a subgraph. 

Lemma 2.3. Let $d^* = \max_{S \subseteq V} \rho(S)$ be the maximum density of any subgraph in $G$. Then
$m/n \leq d^* < n$.

Proof. Clearly, we have $d^* \geq \rho(V) = |E|/|V| = m/n$. On the other hand, consider any nonempty subset
of nodes $S' \subseteq V$. We have $\rho(S') = |E(S')|/|S'| < n|S'|/|S'| = n$. The inequality holds since the
maximum degree of a node is $(n - 1)$, and hence the subgraph induced by the nodes in $S'$ has
fewer than $n|S'|$ edges. Thus, we get: $d^* = \max_{S' \subseteq V} \rho(S') < n$. □

The final lemma will also be very helpful in analyzing our algorithm in later sections. 

Lemma 2.4. Let $S^* \subseteq V$ be a subset of nodes with maximum density, i.e., $\rho(S^*) \geq \rho(S)$ for all
$S \subseteq V$. Then $D_v(S^*) \geq \rho(S^*)$ for all $v \in S^*$. Thus, the degree of each node in $G(S^*)$ is at least
the density of $S^*$.

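Proof. Suppose that there is a node $v \in S^*$ with $D_v(S^*) < \rho(S^*)$. Define the set $S' \leftarrow S^* \setminus \{v\}$.
We derive the following bound on the average degree in $S'$.

$$
\begin{aligned}
\delta(S') &= \frac{\sum_{u \in S'} D_u(S')}{|S'|} \\
&= \frac{\sum_{u \in S^*} D_u(S^*) - 2 \cdot D_v(S^*)}{|S^*| - 1} \\
&= \frac{\delta(S^*) \cdot |S^*| - 2 \cdot D_v(S^*)}{|S^*| - 1} \\
&> \frac{\delta(S^*) \cdot |S^*| - \delta(S^*)}{|S^*| - 1} \qquad \text{(since by assumption } D_v(S^*) < \rho(S^*) = \delta(S^*)/2\text{)} \\
&= \delta(S^*)
\end{aligned}
$$

Since $\delta(S') > \delta(S^*)$, we infer that $\rho(S') > \rho(S^*)$. But this contradicts the assumption that the
subset of nodes $S^*$ has maximum density. Thus, we conclude that $D_v(S^*) \geq \rho(S^*)$ for every node
$v \in S^*$. □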
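Lemma 2.4 can be checked numerically on a tiny instance. The sketch below finds a maximum-density set by exhaustive search and then verifies the degree condition; the function names and the example graph are assumptions for illustration, and a single example is of course no substitute for the proof.

```python
from itertools import combinations

def induced_edges(edges, nodes):
    """Edges with both endpoints in `nodes`."""
    return [(u, v) for u, v in edges if u in nodes and v in nodes]

def check_lemma_2_4(vertices, edges):
    """Return (S*, rho(S*), whether D_v(S*) >= rho(S*) for every v in S*)."""
    vertices = list(vertices)
    best, best_rho = None, -1.0
    for k in range(1, len(vertices) + 1):
        for sub in combinations(vertices, k):
            s = set(sub)
            rho = len(induced_edges(edges, s)) / len(s)
            if rho > best_rho:
                best, best_rho = s, rho
    # Degrees inside the induced subgraph G(S*).
    deg = {v: 0 for v in best}
    for u, v in induced_edges(edges, best):
        deg[u] += 1
        deg[v] += 1
    holds = all(deg[v] >= best_rho for v in best)
    return best, best_rho, holds

# Hypothetical example: a 4-clique on {0, 1, 2, 3} plus a pendant node 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
s_star, rho_star, holds = check_lemma_2_4(range(5), edges)
```

In this example every node of the densest set has induced degree 3, which is at least the density 1.5, as the lemma predicts.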


2.2 $(\alpha, d, L)$-decomposition

Our algorithms will use the concept of an “$(\alpha, d, L)$-decomposition”, as defined below. To give
some intuition behind Definition 2.5, suppose that we start by setting $Z_1 \leftarrow V$. Next, suppose
that we have already constructed the subsets $Z_1 \supseteq \cdots \supseteq Z_i$ for some positive integer $i < L$. While
constructing the next subset $Z_{i+1}$, we ensure that the following two conditions are satisfied.

• All the nodes $v \in Z_i$ with $D_v(Z_i) > \alpha d$ must be included in $Z_{i+1}$.

• All the nodes $v \in Z_i$ with $D_v(Z_i) < d$ must be excluded from $Z_{i+1}$.

Using this iterative procedure, we can build an $(\alpha, d, L)$-decomposition.

Definition 2.5. Fix any $\alpha \geq 1$, $d \geq 0$, and any positive integer $L$. Consider a family of subsets
$Z_1 \supseteq \cdots \supseteq Z_L$. The tuple $(Z_1, \ldots, Z_L)$ is an $(\alpha, d, L)$-decomposition of the input graph
$G = (V, E)$ iff $Z_1 = V$ and, for every $i \in [L-1]$, we have $Z_{i+1} \supseteq \{v \in Z_i : D_v(Z_i) > \alpha d\}$ and
$Z_{i+1} \cap \{v \in Z_i : D_v(Z_i) < d\} = \emptyset$.

Given an $(\alpha, d, L)$-decomposition $(Z_1, \ldots, Z_L)$, we define $V_i = Z_i \setminus Z_{i+1}$ for all $i \in [L-1]$, and
$V_i = Z_i$ for $i = L$. We say that the nodes in $V_i$ constitute the $i$-th level of this decomposition. We
also denote the level of a node $v \in V$ by $\ell(v)$. Thus, we have $\ell(v) = i$ whenever $v \in V_i$.
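Definition 2.5 and the level sets can be turned into a short static sketch. The function names and the `keep_ambiguous` flag are assumptions introduced here to show that the decomposition is not unique: nodes whose degree lies between $d$ and $\alpha d$ may be kept or dropped. This builds a decomposition from scratch and is not the dynamic maintenance procedure of Sections 4 and 5.

```python
def build_decomposition(adj, alpha, d, L, keep_ambiguous=True):
    """Return [Z_1, ..., Z_L] for an undirected graph given as adjacency sets."""
    Z = [set(adj)]                          # Z_1 = V
    for _ in range(L - 1):
        cur = Z[-1]
        nxt = set()
        for v in cur:
            deg = len(adj[v] & cur)         # D_v(Z_i)
            # deg > alpha*d: must keep; deg < d: must drop; otherwise free choice.
            if deg > alpha * d or (keep_ambiguous and deg >= d):
                nxt.add(v)
        Z.append(nxt)
    return Z

def is_valid_decomposition(adj, Z, alpha, d):
    """Check the two conditions of Definition 2.5 and the nesting of the sets."""
    if Z[0] != set(adj):
        return False
    for cur, nxt in zip(Z, Z[1:]):
        if not nxt <= cur:
            return False
        for v in cur:
            deg = len(adj[v] & cur)
            if deg > alpha * d and v not in nxt:
                return False
            if deg < d and v in nxt:
                return False
    return True

def levels(Z):
    """V_i = Z_i minus Z_{i+1} for i < L, and V_L = Z_L."""
    return [Z[i] - Z[i + 1] for i in range(len(Z) - 1)] + [Z[-1]]

# Hypothetical example: a 4-clique on {0, 1, 2, 3} plus a pendant node 4.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
adj = {v: set() for v in range(5)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

Z_keep = build_decomposition(adj, alpha=2, d=1.5, L=3, keep_ambiguous=True)
Z_drop = build_decomposition(adj, alpha=2, d=1.5, L=3, keep_ambiguous=False)
```

Both `Z_keep` and `Z_drop` satisfy Definition 2.5 for the same parameters even though they differ, which is the freedom the dynamic algorithm exploits to avoid moving nodes between levels too often.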

The following theorem and its immediate corollary will be of crucial importance. Roughly
speaking, they state that we can use the $(\alpha, d, L)$-decomposition to $2\alpha(1+\epsilon)^2$-approximate the
densest subgraph by trying different values of $d$ in powers of $(1+\epsilon)$.

Theorem 2.6. Fix any $\alpha \geq 1$, $d \geq 0$, $\epsilon \in (0,1)$, $L = 2 + \lceil \log_{1+\epsilon} n \rceil$. Let $d^* = \max_{S \subseteq V} \rho(S)$ be the
maximum density of any subgraph in $G = (V, E)$, and let $(Z_1, \ldots, Z_L)$ be an $(\alpha, d, L)$-decomposition
of $G = (V, E)$. Then we have:

1. If $d > 2(1+\epsilon)d^*$, then $Z_L = \emptyset$.

2. Else if $d < d^*/\alpha$, then $Z_L \neq \emptyset$ and there is an index $j \in \{1, \ldots, L-1\}$ such that
$\rho(Z_j) \geq d/(2(1+\epsilon))$.


Proof.

1. Suppose that $d > 2(1+\epsilon)d^*$. Consider any level $i \in [L-1]$, and note that $\delta(Z_i) = 2 \cdot \rho(Z_i) \leq
2 \cdot \max_{S \subseteq V} \rho(S) = 2d^* < d/(1+\epsilon)$. It follows that the number of nodes $v$ in $G(Z_i)$ with
degree $D_v(Z_i) \geq d$ is less than $|Z_i|/(1+\epsilon)$, as otherwise $\delta(Z_i) \geq d/(1+\epsilon)$. Let us define the
set $C_i = \{v \in Z_i : D_v(Z_i) < d\}$. We have $|Z_i \setminus C_i| < |Z_i|/(1+\epsilon)$. Now, from Definition 2.5 we
have $Z_{i+1} \cap C_i = \emptyset$, which, in turn, implies that $|Z_{i+1}| \leq |Z_i \setminus C_i| < |Z_i|/(1+\epsilon)$. Thus, for all
$i \in [L-1]$, we have $|Z_{i+1}| < |Z_i|/(1+\epsilon)$. Multiplying all these inequalities, for $i = 1$ to $L-1$,
we conclude that $|Z_L| < |Z_1|/(1+\epsilon)^{L-1}$. Since $|Z_1| = |V| = n$ and $L = 2 + \lceil \log_{1+\epsilon} n \rceil$, we
get $|Z_L| < n/(1+\epsilon)^{1 + \log_{1+\epsilon} n} < 1$. This can happen only if $Z_L = \emptyset$.

2. Suppose that $d < d^*/\alpha$, and let $S^* \subseteq V$ be a subset of nodes with highest density, i.e.,
$\rho(S^*) = d^*$. We will show that $S^* \subseteq Z_i$ for all $i \in \{1, \ldots, L\}$. This will imply that $Z_L \neq \emptyset$.
Clearly, we have $S^* \subseteq V = Z_1$. By induction hypothesis, assume that $S^* \subseteq Z_i$ for some
$i \in [L-1]$. We show that $S^* \subseteq Z_{i+1}$. By Lemma 2.4, for every node $v \in S^*$, we have
$D_v(Z_i) \geq D_v(S^*) \geq \rho(S^*) = d^* > \alpha d$. Hence, from Definition 2.5, we get $v \in Z_{i+1}$ for all
$v \in S^*$. This implies that $S^* \subseteq Z_{i+1}$.

Next, we will show that if $d < d^*/\alpha$, then there is an index $j \in \{1, \ldots, L-1\}$ such that
$\rho(Z_j) \geq d/(2(1+\epsilon))$. For the sake of contradiction, suppose that this is not the case. Then
we have $d < d^*/\alpha$ and $\delta(Z_i) = 2 \cdot \rho(Z_i) < d/(1+\epsilon)$ for every $i \in \{1, \ldots, L-1\}$. Then,
applying an argument similar to case (1), we conclude that $|Z_{i+1}| < |Z_i|/(1+\epsilon)$ for every
$i \in \{1, \ldots, L-1\}$, which implies that $Z_L = \emptyset$. Thus, we arrive at a contradiction.

□

Corollary 2.7. Fix $\alpha, \epsilon, L, d^*$ as in Theorem 2.6. Let $\pi, \sigma > 0$ be any two numbers satisfying
$\alpha \cdot \pi < d^* < \sigma/(2(1+\epsilon))$. Fix any integer $K \geq 2 + \lceil \log_{1+\epsilon}(\sigma/\pi) \rceil$. Discretize the range $[\pi, \sigma]$ into
powers of $(1+\epsilon)$, by defining $d_k = (1+\epsilon)^{k-1} \cdot \pi$ for every $k \in [K]$. Next, for every $k \in [K]$, construct
an $(\alpha, d_k, L)$-decomposition $(Z_1(k), \ldots, Z_L(k))$ of $G = (V, E)$. Let $k' = \max\{k \in [K] : Z_L(k) \neq \emptyset\}$.
Then we have the following guarantees:

1. $d^*/(\alpha(1+\epsilon)) \leq d_{k'} \leq 2(1+\epsilon)^2 \cdot d^*$.

2. There exists an index $j' \in \{1, \ldots, L-1\}$ such that $\rho(Z_{j'}(k')) \geq d_{k'}/(2(1+\epsilon))$.

Proof. 

1. Note that $\pi < d^*/\alpha$ and $\sigma > 2d^*(1+\epsilon)$. Furthermore, we have $d_1 \leq \pi$ and $d_K \geq (1+\epsilon)\sigma$.
This implies that $d_1 < d^*/\alpha$ and $d_K > 2d^*(1+\epsilon)^2$. Next, note that successive $d_k$ values differ
from each other by a factor of $(1+\epsilon)$. Accordingly, there exists some index $k \in [K]$ for which
$d^*/(\alpha(1+\epsilon)) \leq d_k \leq 2(1+\epsilon)^2 \cdot d^*$. In other words, the set $Q = \{k \in [K] : d^*/(\alpha(1+\epsilon)) \leq d_k \leq
2(1+\epsilon)^2 \cdot d^*\}$ is nonempty. Let $k_1 = \min_{k \in Q}\{k\}$ and $k_2 = \max_{k \in Q}\{k\}$ respectively denote
the minimum and maximum indices in the set $Q$. Observe that the set $Q$ is “contiguous”,
i.e., $Q = \{k_1, k_1 + 1, \ldots, k_2\}$. Since the $d_k$ values are discretized in powers of $(1+\epsilon)$, we have
$d_{k_1} < d^*/\alpha$ and $d_{k_2} > 2d^*(1+\epsilon)$. Hence, by Theorem 2.6, we have $Z_L(k_1) \neq \emptyset$, and $Z_L(k) = \emptyset$
for all $k \geq k_2$. It follows that the index $k'$ must satisfy the inequality $k_1 \leq k' \leq k_2$, which
means that $k' \in Q$. Thus, we have $d^*/(\alpha(1+\epsilon)) \leq d_{k'} \leq 2(1+\epsilon)^2 \cdot d^*$.

2. Suppose that the claim is false. Then we have $Z_L(k') \neq \emptyset$ and $\delta(Z_i(k')) = 2 \cdot \rho(Z_i(k')) <
d_{k'}/(1+\epsilon)$ for every $i \in \{1, \ldots, L-1\}$. Then, applying an argument similar to the proof of case
(1) in Theorem 2.6, we conclude that $|Z_{i+1}(k')| < |Z_i(k')|/(1+\epsilon)$ for every $i \in \{1, \ldots, L-1\}$,
which implies that $Z_L(k') = \emptyset$. Thus, we arrive at a contradiction.

□ 
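The way Corollary 2.7 turns decompositions into a density estimate can be sketched as follows. This is an offline illustration under stated assumptions, not the paper's dynamic algorithm: it picks $\pi = m/(2\alpha n)$ and $\sigma = 2(1+\epsilon)n$ (so that $\alpha\pi < m/n \leq d^*$ and $\sigma/(2(1+\epsilon)) = n > d^*$), builds each decomposition with the simple rule "keep $v$ iff $D_v(Z_i) \geq d_k$", and returns $d_{k'}$. The function names are hypothetical.

```python
from math import ceil, log


def peel(adj, d, L):
    """One valid (alpha, d, L)-decomposition: keep v iff D_v(Z_i) >= d."""
    Z = [set(adj)]
    for _ in range(L - 1):
        cur = Z[-1]
        Z.append({v for v in cur if len(adj[v] & cur) >= d})
    return Z


def estimate_density(adj, alpha, eps):
    """Return d_{k'}: the largest d_k whose topmost set Z_L(k) is nonempty."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    if m == 0:
        return 0.0
    L = 2 + ceil(log(n) / log(1 + eps))
    pi = m / (2 * alpha * n)
    sigma = 2 * (1 + eps) * n
    K = 2 + ceil(log(sigma / pi) / log(1 + eps))
    best = None
    for k in range(1, K + 1):
        d_k = (1 + eps) ** (k - 1) * pi
        if peel(adj, d_k, L)[-1]:          # Z_L(k) is nonempty
            best = d_k
    return best


# Example: clique on 5 nodes, true maximum density d* = 2.
clique = {v: {u for u in range(5) if u != v} for v in range(5)}
alpha, eps, d_star = 2.0, 0.5, 2.0
estimate = estimate_density(clique, alpha, eps)
```

By guarantee (1) of the corollary, `estimate` must lie between $d^*/(\alpha(1+\epsilon))$ and $2(1+\epsilon)^2 d^*$ on this input.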

We will use the above corollary as follows. Lemma 2.3 states that $m/n \leq d^* < n$. Thus, in
Corollary 2.7, we can choose the values of $\pi$, $\sigma$ and $K$ in a way that ensures $K = \tilde{O}(1)$.
Hence, to maintain a $2\alpha(1+\epsilon)^3 = (2\alpha + O(\epsilon))$-approximation of the maximum density, it suffices to
maintain $K = \tilde{O}(1)$ many $(\alpha, d, L)$-decompositions and to keep track of the maximum $d$ for which
the topmost level (i.e., the node set $V_L$) of the decomposition is nonempty. This gives a query
time of $\tilde{O}(1)$. In addition, if we want to answer a more general query which asks us to output a
subgraph of approximate maximum density, then we simply keep track of the densities of all the
node-induced subgraphs of the form $Z_i(k)$, where $i \in [L-1]$, $k \in [K]$, and output the one among
them with maximum density. Since there are only $K = \tilde{O}(1)$ many decompositions to consider, and
since each such decomposition has $L = \tilde{O}(1)$ levels, this can be done by incurring an additional cost
of no more than $O(KL) = \tilde{O}(1)$ in the update time. It turns out that this simple extension applies
to all the dynamic and streaming algorithms presented in the paper. Accordingly, for simplicity of
exposition, from now on we only focus on the simpler query which asks for an estimate of the value
of the densest subgraph (and not the subgraph itself).

2.3 Two results on $\ell_0$-sampling and uniform hashing

We now state a well-known result on $\ell_0$-sampling in the streaming setting. All the streaming
algorithms in this paper will use this result.

Lemma 2.8 ($\ell_0$-sampler [24]). We can process a dynamic stream of $O(\text{poly } n)$ updates in the graph
$G = (V, E)$ in $\tilde{O}(1)$ space, and with high probability, at each step we can maintain a simple random
sample from the set $E$. The algorithm takes $\tilde{O}(1)$ time to handle each update in the stream.

The next lemma deals with uniform hashing in constant time and optimal space. We will use
this lemma in Section 5.2.

Lemma 2.9 ([37]). Let $E^* = \binom{V}{2}$ be the set of all possible unordered pairs of nodes in $V$.
Consider any two integers $w, q \geq 1$. We can construct a $w$-wise independent uniform hash function
$h : E^* \to [q]$ using $O(w \, \text{poly}(\log w, \log q, \log n))$ bits of space. Given any $e \in E^*$, the hash value
$h(e)$ can be evaluated in $O(1)$ time.
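For intuition about what a $w$-wise independent hash function on node pairs looks like, here is a minimal sketch based on the textbook construction of a random polynomial of degree $w-1$ over a prime field. This is explicitly not the construction of [37]: evaluation here costs $O(w)$ time rather than $O(1)$, and the final reduction modulo $q$ is only approximately uniform. The class name `PolyHash` and the choice of prime are assumptions made for illustration.

```python
import random


class PolyHash:
    """Hash unordered node pairs into {1, ..., q} with a random polynomial."""

    P = (1 << 61) - 1  # Mersenne prime; assumed much larger than n*n and q

    def __init__(self, w, q, n, seed=None):
        rng = random.Random(seed)
        self.n, self.q = n, q
        # w random coefficients give a polynomial of degree w - 1
        self.coeffs = [rng.randrange(self.P) for _ in range(w)]

    def key(self, u, v):
        # Encode the unordered pair {u, v}, with 0 <= u, v < n, as one integer.
        a, b = (u, v) if u < v else (v, u)
        return a * self.n + b

    def __call__(self, u, v):
        x, acc = self.key(u, v), 0
        for c in self.coeffs:              # Horner's rule, O(w) time
            acc = (acc * x + c) % self.P
        return acc % self.q + 1


h = PolyHash(w=4, q=16, n=100, seed=1)
value = h(3, 7)
```

The pair encoding makes the hash symmetric, so `h(3, 7)` and `h(7, 3)` always agree.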

2.4 Concentration bounds 

We will use the following concentration bounds throughout the rest of this paper. 

Theorem 2.10. (Chernoff bound-I) Consider a collection of mutually independent random variables
$\{X_1, \ldots, X_t\}$ such that $X_i \in [0,1]$ for all $i \in \{1, \ldots, t\}$. Let $X = \sum_{i=1}^{t} X_i$ be the sum of these
random variables. Then we have $\Pr[X \geq (1+\epsilon)\mu] \leq e^{-\epsilon^2 \mu/3}$ whenever $E[X] \leq \mu$.

Theorem 2.11. (Chernoff bound-II) Consider a set of mutually independent random variables
$\{X_1, \ldots, X_t\}$ such that $X_i \in [0,1]$ for all $i \in \{1, \ldots, t\}$. Let $X = \sum_{i=1}^{t} X_i$ be the sum of these
random variables. Then we have $\Pr[X \leq (1-\epsilon)\mu] \leq e^{-\epsilon^2 \mu/2}$ whenever $E[X] \geq \mu$.

Definition 2.12. (Negative association) A set of random variables $\{X_1, \ldots, X_t\}$ are negatively
associated iff for all disjoint subsets $I, J \subseteq \{1, \ldots, t\}$ and all non-decreasing functions $f$ and $g$, we
have $E[f(X_i, i \in I) \cdot g(X_j, j \in J)] \leq E[f(X_i, i \in I)] \cdot E[g(X_j, j \in J)]$.

Theorem 2.13. (Chernoff bound with negative dependence) The Chernoff bounds, as stated in
Theorems 2.10 and 2.11, hold even if the random variables $\{X_1, \ldots, X_t\}$ are negatively associated.

3 A Semi-Streaming Algorithm 

In this section, we present a single-pass semi-streaming algorithm for the densest subgraph problem.
The algorithm requires only $\tilde{O}(n)$ bits of space, and at the end of the stream outputs a $(2+\epsilon)$-
approximation to the value of the densest subgraph with high probability. On the negative side,
its update time can be as large as $\tilde{\Omega}(n)$. Our result in this section is stated in the theorem below.

Theorem 3.1. In a single pass, we can process a dynamic stream of updates in the graph $G$ in $\tilde{O}(n)$
space. With high probability, we can return a $(2 + O(\epsilon))$-approximation of the maximum density
$d^* = \max_{S \subseteq V} \rho(S)$ at the end of the stream.

We devote the rest of this section to the proof of Theorem 3.1. Throughout this section,
we fix a small constant $\epsilon \in (0, 1/2)$ and a sufficiently large constant $c > 1$. Moreover, we set
$\alpha \leftarrow (1+\epsilon)/(1-\epsilon)$ and $L \leftarrow 2 + \lceil \log_{(1+\epsilon)} n \rceil$.

First, we show that we can construct an $(\alpha, d, L)$-decomposition by sampling $\tilde{O}(n)$ edges.

Lemma 3.2. Fix an integer $d > 0$, and let $S$ be a collection of $cm(L-1)\log n/d$ mutually independent
random samples (each consisting of one edge) from the edge set $E$ of the input graph
$G = (V, E)$. With high probability we can construct from $S$ an $(\alpha, d, L)$-decomposition $(Z_1, \ldots, Z_L)$
of $G$, using only $\tilde{O}(n + m/d)$ bits of space.

Proof. We partition the samples in $S$ evenly among $(L-1)$ groups $\{S_i\}, i \in [L-1]$. Thus, each
$S_i$ is a collection of $cm \log n/d$ mutually independent random samples from the edge set $E$, and,
furthermore, the collections $\{S_i\}, i \in [L-1]$, themselves are mutually independent.

Consider any index $i \in \{1, \ldots, L-1\}$. Note that an edge $(u, v) \in E$ can appear multiple times
in the collection of samples $S_i$. We will slightly abuse the notation introduced in the beginning of
Section 2, and let $D_v(V', S_i)$ denote the degree of a node $v \in V' \subseteq V$ in the multigraph induced by
the node set $V'$ and the samples in $S_i$. With this notation in hand, our algorithm works as follows.

• Set $Z_1 \leftarrow V$.

• For $i = 1$ to $(L-1)$: Set $Z_{i+1} \leftarrow \{v \in Z_i : D_v(Z_i, S_i) \geq (1-\epsilon)\alpha c \log n\}$.
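The two steps above can be sketched in a few lines. This is an illustration only: it draws the samples directly from an explicit edge list with `random.Random.choice`, whereas the streaming algorithm obtains them from $\ell_0$-samplers, and the function name and argument order are assumptions. Natural logarithms are used for $\log n$; any fixed base only changes the constant $c$.

```python
import random
from math import ceil, log


def sampled_decomposition(nodes, edges, d, eps, c, L, rng):
    """Build Z_1, ..., Z_L from uniform edge samples, as in Lemma 3.2."""
    n, m = len(nodes), len(edges)
    alpha = (1 + eps) / (1 - eps)
    per_group = ceil(c * m * log(n) / d)          # |S_i| = c m log n / d
    threshold = (1 - eps) * alpha * c * log(n)
    Z = [set(nodes)]
    for _level in range(L - 1):
        cur = Z[-1]
        deg = dict.fromkeys(cur, 0)               # D_v(Z_i, S_i)
        for _sample in range(per_group):          # a fresh group S_i per level
            u, v = rng.choice(edges)
            if u in cur and v in cur:
                deg[u] += 1
                deg[v] += 1
        Z.append({v for v in cur if deg[v] >= threshold})
    return Z


nodes = list(range(5))
edges = [(u, v) for u in nodes for v in nodes if u < v]   # clique on 5 nodes
Z = sampled_decomposition(nodes, edges, d=1, eps=0.25, c=20, L=4,
                          rng=random.Random(0))
Z_big_d = sampled_decomposition(nodes, edges, d=1000, eps=0.25, c=20, L=4,
                                rng=random.Random(0))
```

For a very large $d$ each group holds a single sample, no sampled degree can reach the threshold, and every set after $Z_1$ is empty.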

To analyze the correctness of the algorithm, define the (random) sets $A_i = \{v \in Z_i : D_v(Z_i, E) >
\alpha d\}$ and $B_i = \{v \in Z_i : D_v(Z_i, E) < d\}$ for all $i \in [L-1]$. Note that for all $i \in [L-1]$, the
random sets $Z_i, A_i, B_i$ are completely determined by the outcomes of the samples in $\{S_j\}, j < i$.
In particular, the samples in $S_i$ are chosen independently of the sets $Z_i, A_i, B_i$. Let $\mathcal{E}_i$ be the event
that (a) $Z_{i+1} \supseteq A_i$ and (b) $Z_{i+1} \cap B_i = \emptyset$. By Definition 2.5, the output $(Z_1, \ldots, Z_L)$ is a valid
$(\alpha, d, L)$-decomposition of $G$ iff the event $\bigcap_{i=1}^{L-1} \mathcal{E}_i$ occurs. Consider any $i \in [L-1]$. Below, we show
that the event $\mathcal{E}_i$ occurs with high probability. The lemma follows by taking a union bound over
all $i \in [L-1]$.

Fix any instantiation of the random set $Z_i$. Condition on this event, and note that this event
completely determines the sets $A_i, B_i$. Consider any node $v \in A_i$. Let $X_{v,i}(j) \in \{0,1\}$ be an
indicator random variable for the event that the $j$-th sample in $S_i$ is of the form $(u, v)$, with $u \in
\mathcal{N}_v(Z_i)$. Note that the random variables $\{X_{v,i}(j)\}, j$, are mutually independent. Furthermore, we
have $E[X_{v,i}(j) \mid Z_i] = D_v(Z_i)/m > \alpha d/m$ for all $j$. Since there are $cm \log n/d$ such samples in $S_i$,
by linearity of expectation we get: $E[D_v(Z_i, S_i) \mid Z_i] = \sum_j E[X_{v,i}(j) \mid Z_i] > (cm \log n/d) \cdot (\alpha d/m) =
\alpha c \log n$. The node $v$ is included in $Z_{i+1}$ iff $D_v(Z_i, S_i) \geq (1-\epsilon)\alpha c \log n$, and this event, in turn,
occurs with high probability (by Chernoff bound). Taking a union bound over all nodes $v \in A_i$,
we conclude that $\Pr[Z_{i+1} \supseteq A_i \mid Z_i] \geq 1 - 1/(\text{poly } n)$. Using a similar line of reasoning, we get
that $\Pr[Z_{i+1} \cap B_i = \emptyset \mid Z_i] \geq 1 - 1/(\text{poly } n)$. Invoking a union bound over these two events, we get
$\Pr[\mathcal{E}_i \mid Z_i] \geq 1 - 1/(\text{poly } n)$. Since this holds for all possible instantiations of $Z_i$, the event $\mathcal{E}_i$ itself
occurs with high probability.

The space requirement of the algorithm, ignoring poly log factors, is proportional to the number
of samples in $S$ (which is $cm(L-1)\log n/d$) plus the number of nodes in $V$ (which is $n$). Since $c$ is a
constant and since $L = \tilde{O}(1)$, we derive that the total space requirement is $O((n + m/d) \, \text{poly} \log n)$.

□ 

Now, to turn Lemma 3.2 into a streaming algorithm, we simply have to invoke Lemma 2.8, which
follows from a well-known result about $\ell_0$-sampling in the streaming model [24], and a simple (but
important) observation in Lemma 2.3.

Proof of Theorem 3.1. Define $\lambda^* = 2\alpha \cdot (cn(L-1)\log n)$ and $K^* = 2 + \lceil \log_{(1+\epsilon)}(8\alpha n^2) \rceil$. While
processing the stream of edge insertions/deletions, we simultaneously run $\lambda^* K^*$ mutually independent
copies of the $\ell_0$-sampler as per Lemma 2.8. Furthermore, we maintain a counter to keep track
of the number of edges in the graph. Initially, the counter is set to zero. After each edge insertion
(resp. deletion), the counter is incremented (resp. decremented) by one. Thus, at the end of the
stream, we get $\lambda^* K^*$ mutually independent uniform random samples from the edge set $E$, and the
exact number of edges in the graph (which is given by $m = |E|$). All these steps
can be implemented in $\tilde{O}(\lambda^* K^*) = \tilde{O}(n)$ space. Note that if $m = 0$, then clearly the maximum
density of a subgraph is also zero. Thus, for the rest of the proof, we assume that $1 \leq m \leq \binom{n}{2}$.

Next, at the end of the stream, we define $\pi = m/(2\alpha n)$ and $\sigma = 2(1+\epsilon)n$. By Lemma 2.3,
we have $\alpha \cdot \pi < d^* < \sigma/(2(1+\epsilon))$. Thus, the values of $\pi$ and $\sigma$ satisfy the condition required
by Corollary 2.7. Hence, following Corollary 2.7, we set $K = 2 + \lceil \log_{(1+\epsilon)}(\sigma/\pi) \rceil$, and discretize
the range $[\pi, \sigma]$ in powers of $(1+\epsilon)$ by defining $d_k = (1+\epsilon)^{k-1} \cdot \pi$ for every integer $k \in [K]$.
Furthermore, we define $\lambda_k = cm(L-1)\log n/d_k$ for all $k \in [K]$. Our goal is to construct an
$(\alpha, d_k, L)$-decomposition of $G$ for every $k \in [K]$. By Lemma 3.2, for this we need $\sum_{k=1}^{K} \lambda_k$ many
mutually independent random samples from $E$. But note that:

$$\lambda_k \leq \lambda_1 = cm(L-1)\log n/d_1 = cm(L-1)\log n/\pi = 2\alpha \cdot (c(L-1)n\log n) = \lambda^* \quad \text{for all } k \in [K].$$

Next, since $\pi = m/(2\alpha n)$, $\sigma = 2(1+\epsilon)n$ and $m \geq 1$, we have $\sigma/\pi = 4\alpha(1+\epsilon)n^2/m \leq 8\alpha n^2$, and hence
$K^* = 2 + \lceil \log_{(1+\epsilon)}(8\alpha n^2) \rceil \geq 2 + \lceil \log_{(1+\epsilon)}(\sigma/\pi) \rceil = K$. To summarize, we infer the following guarantee.

$$\sum_{k=1}^{K} \lambda_k \leq \lambda^* K^*.$$

In other words, while processing the stream of edge insertions/deletions, we have collected sufficiently
many mutually independent random samples from $E$ so as to construct an $(\alpha, d_k, L)$-decomposition
for every integer $k \in [K]$. We can now get a $(2\alpha + O(\epsilon)) = (2 + O(\epsilon))$-approximation
to the value of the densest subgraph by invoking Corollary 2.7. □
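The sample-counting argument in this proof can be checked numerically. The sketch below is an illustration under an explicit assumption: it reads the stream-independent quantities as $\lambda^* = 2\alpha c n (L-1)\log n$ and $K^* = 2 + \lceil \log_{(1+\epsilon)}(8\alpha n^2) \rceil$, and compares the number of samples needed for all $K$ decompositions with the $\lambda^* K^*$ samplers that were run. The function name `sample_budget` is hypothetical.

```python
from math import ceil, log


def sample_budget(n, m, eps, c):
    """Return (samples needed at the end of the stream, samplers maintained)."""
    alpha = (1 + eps) / (1 - eps)
    L = 2 + ceil(log(n) / log(1 + eps))
    # Quantities fixed before the stream starts (they do not depend on m).
    lam_star = 2 * alpha * c * n * (L - 1) * log(n)
    K_star = 2 + ceil(log(8 * alpha * n * n) / log(1 + eps))
    # Quantities fixed at the end of the stream, once m is known.
    pi = m / (2 * alpha * n)
    sigma = 2 * (1 + eps) * n
    K = 2 + ceil(log(sigma / pi) / log(1 + eps))
    needed = sum(c * m * (L - 1) * log(n) / ((1 + eps) ** (k - 1) * pi)
                 for k in range(1, K + 1))        # sum of lambda_k
    return needed, lam_star * K_star


budgets = [sample_budget(100, m, eps=0.25, c=10) for m in (1, 50, 4950)]
```

In every case the first component is at most the second. Since the $\lambda_k$ decrease geometrically in $k$, the bound $\sum_k \lambda_k \leq \lambda^* K^*$ is far from tight.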

Remark on the update and query times. The semi-streaming algorithm presented in this
section runs $\tilde{O}(n)$ mutually independent $\ell_0$-samplers as per Lemma 2.8. Thus, when an edge is
inserted into (resp. deleted from) the graph, the algorithm has to update each of these $\ell_0$-samplers.
This requires an update time of $\tilde{O}(n)$. Finally, to answer a query about the maximum density,
we have to first construct $\tilde{O}(1)$ many $(\alpha, d, L)$-decompositions (for different values of $d$) from the
sampled edges maintained by the samplers, and then invoke Corollary 2.7. Thus, the query time
is also $\tilde{O}(n)$.

4 A Dynamic Algorithm with Fast Update and Query Times 

The algorithm in Section 3 maintains a $(2+\epsilon)$-approximation to the value of the densest subgraph
and requires only $\tilde{O}(n)$ space, but its update and query times can be as large as $\tilde{\Omega}(n)$. In this
section, we present an algorithm with $\tilde{O}(1)$ update and query times. This algorithm, however, has
to store all the edges in the graph and hence has a space requirement of $\tilde{O}(m + n)$. Furthermore,
this algorithm maintains an approximation guarantee of $(4+\epsilon)$.

Throughout this section, we set $\pi = 1/(4n)$, $\sigma = 4n$, $L = 2 + \lceil \log_{(1+\epsilon)} n \rceil$, and $\alpha = 2 + 3\epsilon$, where
$\epsilon \in (0,1)$ is some small constant. Note that $\alpha \cdot \pi < d^* < \sigma/(2(1+\epsilon))$, where $d^*$ is the optimal
density in the input graph. As in Corollary 2.7, we discretize the range $[\pi, \sigma]$ in powers of $(1+\epsilon)$
by defining the values $\{d_k\}, k \in [K]$, by setting $K = 2 + \lceil \log_{(1+\epsilon)}(\sigma/\pi) \rceil$.

We show in Section 4.1 how to maintain an $(\alpha, d_k, L)$-decomposition of $G$ for each $k \in [K]$
in $O(L/\epsilon) = O(\log n/\epsilon^2)$ amortized update time and $O(m + n)$ space (see Theorem 4.2). Since
$K = O(\log n/\epsilon)$, the total update time for all the $K$ decompositions is $O(L \log n/\epsilon^2) = \tilde{O}(1)$ and
the total space requirement is also $O(K \cdot (m + n)) = \tilde{O}(m + n)$. By Corollary 2.7, this gives a
$2\alpha(1+\epsilon)^2 = 4 + O(\epsilon)$-approximation to the optimal density in $G$, for sufficiently small $\epsilon$. To answer
a query about the value of the densest subgraph, the algorithm needs to keep track of the index
$k'$ as defined in Corollary 2.7. Since there are $O(K) = \tilde{O}(1)$ decompositions to deal with, the time
taken for this operation can be subsumed within the $\tilde{O}(1)$ update time. This gives us a query time
of $\tilde{O}(1)$. We thus get the main result of this section, which is summarized below.

Theorem 4.1. There is a deterministic dynamic algorithm that maintains a $(4 + O(\epsilon))$-approximation
to the value of the densest subgraph. The algorithm requires $\tilde{O}(m + n)$ space, has an amortized update
time of $\tilde{O}(1)$ and a query time of $\tilde{O}(1)$.

4.1 Dynamically maintaining an (a, d, L)-decomposition 

We present a deterministic data structure that is initially given $\alpha$, $d$, $L$, and a graph $G = (V, E)$ with
$|V| = n$, $E = \emptyset$. The data structure maintains an $(\alpha, d, L)$-decomposition of the graph $G = (V, E)$
at each time-step, and supports the following operations:

• Insert(u,v): Insert the edge $(u, v)$ into the graph.

• Delete(u,v): Delete the edge $(u, v)$ from the graph.

Theorem 4.2 summarizes our result. We devote the rest of Section 4.1 to its proof.

Theorem 4.2. For every polynomially bounded $\alpha \geq 2 + 3\epsilon$, we can deterministically maintain an
$(\alpha, d, L)$-decomposition of $G = (V, E)$. Starting from an empty graph, our data structure handles a
sequence of $t$ update operations (edge insertions/deletions) in total time $O(tL/\epsilon)$. Thus, we get an
amortized update time of $O(L/\epsilon)$. The space complexity of the data structure at a given time-step
is $O(n + m)$, where $m = |E|$ denotes the number of edges in the input graph at that time-step.

Data Structures. We use the following data structures. 

1. Every node $v \in V$ maintains $L$ lists Friends$_i[v]$, for $i \in \{1, \ldots, L\}$. For $i < \ell(v)$, the list
Friends$_i[v]$ consists of the neighbors of $v$ that are at level $i$ (given by the set $\mathcal{N}_v(V_i)$). For
$i = \ell(v)$, the list Friends$_i[v]$ consists of the neighbors of $v$ that are at level $i$ or above (given
by the set $\mathcal{N}_v(Z_i)$). For $i > \ell(v)$, the list Friends$_i[v]$ is empty. Each list is stored in a doubly
linked list together with its size, Count$_i[v]$. Using appropriate pointers, we can insert or
delete a given node to or from a concerned list in constant time.

2. The counter Level$[v]$ keeps track of the level of the node $v$.

Algorithm. If a node violates one of the conditions of an $(\alpha, d, L)$-decomposition (see Definition 2.5),
then we call the node “dirty”, else the node is called “clean”. Specifically, a node $y$ at
level $\ell(y) = i$ is dirty iff either $D_y(Z_i) > \alpha d$ or $D_y(Z_{i-1}) < d$. Initially, the input graph $G = (V, E)$
is empty, every node $v \in V$ is at level 1, and every node is clean.

When an edge $(u, v)$ is inserted/deleted, we first update the Friends lists of $u$ and $v$ by adding
or removing neighbors in constant time. Next we check whether $u$ or $v$ are dirty. If so, we run
the RECOVER() procedure described in Figure 1. Note that a single iteration of the While loop
(Steps 01-05) may change the status of some more nodes from clean to dirty (or vice versa). If and
when the procedure terminates, however, every node is clean by definition.

Analyzing the space complexity. Since each edge in $G$ appears in two linked lists (corresponding
to each of its endpoints), the space complexity of the data structure is $O(n + m)$, where
$m = |E|$.

01. While there exists a dirty node $y$

02.     If $D_y(Z_{\ell(y)}) > \alpha d$ and $\ell(y) < L$, Then

03.         Increment the level of $y$ by setting $\ell(y) \leftarrow \ell(y) + 1$.

04.     Else if $D_y(Z_{\ell(y)-1}) < d$ and $\ell(y) > 1$, Then

05.         Decrement the level of $y$ by setting $\ell(y) \leftarrow \ell(y) - 1$.


Figure 1: RECOVER().
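The behavior of RECOVER() can be illustrated with a deliberately naive sketch. It recomputes degrees from adjacency sets instead of maintaining the Friends lists and counters, and it rescans every node after each update, so it does not achieve the running time of Theorem 4.2; it only shows the level-adjustment logic of Figure 1. The class and method names are assumptions made for this illustration.

```python
class LevelDecomposition:
    """Naive dynamic (alpha, d, L)-decomposition driven by RECOVER()."""

    def __init__(self, nodes, alpha, d, L):
        self.alpha, self.d, self.L = alpha, d, L
        self.adj = {v: set() for v in nodes}
        self.level = {v: 1 for v in nodes}      # every node starts at level 1

    def deg(self, y, i):
        # D_y(Z_i): number of neighbours of y at level i or above
        return sum(1 for x in self.adj[y] if self.level[x] >= i)

    def dirty_move(self, y):
        # +1 if y must move up, -1 if it must move down, 0 if y is clean
        i = self.level[y]
        if i < self.L and self.deg(y, i) > self.alpha * self.d:
            return 1
        if i > 1 and self.deg(y, i - 1) < self.d:
            return -1
        return 0

    def recover(self):
        pending = list(self.adj)
        while pending:
            y = pending.pop()
            step = self.dirty_move(y)
            if step:
                self.level[y] += step
                # only y and its neighbours can change status after this move
                pending.append(y)
                pending.extend(self.adj[y])

    def insert(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)
        self.recover()

    def delete(self, u, v):
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        self.recover()


ld = LevelDecomposition(range(4), alpha=2.5, d=1, L=5)
clique_edges = [(u, v) for u in range(4) for v in range(u + 1, 4)]
for u, v in clique_edges:
    ld.insert(u, v)
levels_after_inserts = dict(ld.level)
for u, v in clique_edges:
    ld.delete(u, v)
levels_after_deletes = dict(ld.level)
```

With $\alpha d = 2.5$, every node of the 4-clique has degree 3 and keeps being promoted until it reaches level $L$; once all edges are deleted, every node falls back to level 1.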


Analysis of the Update Time. Each update operation takes constant time plus the time for
the RECOVER() procedure. We show below that the total time spent in procedure RECOVER()
during $t$ update operations is $O(tL/\epsilon)$.

Potential Function. To determine the amortized update time we use a potential function $\mathcal{B}$. Let
$f(u, v) = 1$ if $\ell(u) = \ell(v)$ and let it be 0 otherwise. We define $\mathcal{B}$ and the node and edge potentials
$\Phi(v)$ and $\Psi(u, v)$ as follows.

$$\mathcal{B} = \sum_{v \in V} \Phi(v) + \sum_{e \in E} \Psi(e) \qquad (1)$$

$$\Phi(v) = \frac{1}{\epsilon} \sum_{i=1}^{\ell(v)-1} \max(0, \alpha d - D_v(Z_i)) \quad \text{for all nodes } v \in V \qquad (2)$$

$$\Psi(u, v) = 2(L - \min(\ell(u), \ell(v))) + f(u, v) \quad \text{for all edges } (u, v) \in E \qquad (3)$$

It is easy to check that all these potentials are nonnegative, and that they are uniquely defined
by the partition $V_1, \ldots, V_L$ of the set of nodes $V$. Initially, the input graph $G$ is empty and the
total potential $\mathcal{B}$ is zero. We show that (a) insertion/deletion of an edge (excluding subroutine
RECOVER()) increases the total potential by at most $3L/\epsilon$, and (b) for each unit of computation
performed by procedure RECOVER() in Figure 1, the total potential decreases by at least
$\Omega(1)$. Since the total potential always remains nonnegative, these two conditions together imply
an amortized update time of $O(L/\epsilon)$.
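As a sanity check on the definitions, the total potential can be computed directly from the adjacency sets and the level assignment. The sketch below assumes equations (1) to (3) read as $\mathcal{B} = \sum_v \Phi(v) + \sum_e \Psi(e)$ with $\Phi(v) = \frac{1}{\epsilon}\sum_{i < \ell(v)} \max(0, \alpha d - D_v(Z_i))$ and $\Psi(u,v) = 2(L - \min(\ell(u), \ell(v))) + f(u,v)$; the function name and the requirement that node labels be comparable are choices made for this illustration.

```python
def total_potential(adj, level, alpha, d, L, eps):
    """Compute B = sum of node potentials + sum of edge potentials."""

    def deg(v, i):
        # D_v(Z_i): neighbours of v at level i or above
        return sum(1 for x in adj[v] if level[x] >= i)

    phi = sum(max(0, alpha * d - deg(v, i))
              for v in adj
              for i in range(1, level[v])) / eps
    psi = 0
    for u in adj:
        for v in adj[u]:
            if u < v:                      # count each undirected edge once
                same = 1 if level[u] == level[v] else 0
                psi += 2 * (L - min(level[u], level[v])) + same
    return phi + psi


empty = total_potential({0: set(), 1: set()}, {0: 1, 1: 1}, 3, 1, 5, 0.5)
one_edge = total_potential({0: {1}, 1: {0}}, {0: 1, 1: 1}, 3, 1, 5, 0.5)
split = total_potential({0: {1}, 1: {0}}, {0: 2, 1: 1}, 3, 1, 5, 0.5)
```

With both endpoints at level 1 and $L = 5$, the single edge contributes $2(5-1) + 1 = 9 \leq 3L$. When node 0 sits at level 2 instead, its node potential is $(3 - 1)/0.5 = 4$ and the edge potential is $8$.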

Insertion. The insertion of edge $(u, v)$ creates a new potential $\Psi(u, v)$ with value at most $3L$.
Further, the potentials $\Phi(u)$ and $\Phi(v)$ do not increase, and the potentials associated with all other
nodes and edges remain unchanged. Thus, the net increase in the potential $\mathcal{B}$ is at most $3L$.

Deletion. The deletion of edge $(u, v)$ destroys the (nonnegative) potential $\Psi(u, v)$. Further, each
of the potentials $\Phi(u)$ and $\Phi(v)$ increases by at most $L/\epsilon$, and the potentials of all other nodes and
edges remain unchanged. Thus, the net increase in the potential $\mathcal{B}$ is at most $2L/\epsilon$.

It remains to relate the change in the potential $\mathcal{B}$ with the amount of computation performed. See
Section 4.1.2. For ease of exposition, we first give a high level overview of the analysis.

4.1.1 A high level overview of the potential function based analysis 

The intuition behind this potential function is as follows. We maintain a data structure so that the
change of the level of a node $y$ from $i$ to $i+1$ or from $i$ to $i-1$ takes time $O(1 + D_y(Z_i))$. Ignoring
the constant factor (as we can multiply the potential function by this constant), we assume in
the following that the cost is simply $1 + D_y(Z_i)$. The basic idea is that the insertion or deletion
of an edge should increase the potential function in order to pay for all future level changes. To
implement this idea (1) each node gets a potential that increases when an adjacent edge is deleted
and that will pay for future level decreases of the node, and (2) each edge in $G$ gets a potential
that pays for future level increases of its end points. We explain how we implement this in more
detail next. We know that when a node moves up to level $i+1$ it has degree at least $\alpha d$ to nodes
in $Z_i$, while when it moves back down it has degree at most $d$ to nodes in $Z_i$. Assuming that the
drop in the node's degree was caused by the deletion of adjacent edges, the difference of the two, i.e.
$(\alpha - 1)d$, has to be used to pay for the cost of a level decrease of a node, which is $1 + D_y(Z_i) \leq d$.
This is possible if we set $\alpha \geq 2$. The value of $\alpha$ can even be reduced by multiplying the potential
of each node by $1/\epsilon$. Then the drop in potential is $(\alpha - 1)d/\epsilon$ while the cost is only $d$.

There is, however, an additional complication in this scheme, which forces us to set $\alpha = 2 + \Theta(\epsilon)$:
A node on level $i$ might not only decrease its level because of edge deletions (of edges to nodes on
level $i$ or higher), but also if a node on level $i$ moves down to level $i-1$. Said differently, the drop
of $(\alpha - 1)d/\epsilon$ of the degree of a node $y$ on level $i$ might not only be caused by edge deletions, but
also by the level drop of incident nodes. Thus, when the level of a node $y$ decreases, the potential
of all its neighbors on a larger level has to increase by $1/\epsilon$ to pay for their future level decrease.
Thus the drop of the potential of $y$ by $(\alpha - 1)d/\epsilon$ has to “pay” for the increase of the potential
of its neighbors, which is in total at most $d/\epsilon$, and the cost of the operation, which is $d$. This is
possible if we set $\alpha = 2 + \epsilon$.

4.1.2 Analyzing the subroutine RECOVER().

We will analyze any single iteration of the While loop in Figure 1. During this iteration, a dirty
node $y$ either increments its level by one unit, or decrements its level by one unit. Accordingly, we
consider two possible events.

Event 1: A dirty node $y$ changes its level from $i$ to $(i+1)$.

First, we upper bound the amount of computation performed during this event. Our algorithm
scans through the list Friends$_i[y]$ and identifies the neighbors of $y$ that are at level $(i+1)$ or above.
For every such node $x \in \mathcal{N}_y \cap Z_{i+1}$, we need to (a) remove $y$ from the list Friends$_i[x]$ and add
$y$ to the list Friends$_{i+1}[x]$, (b) increment the counter Count$_{i+1}[x]$ by one unit, (c) add $x$ to the
list Friends$_{i+1}[y]$ and remove $x$ from the list Friends$_i[y]$, (d) decrement the counter Count$_i[y]$
by one unit and increment the counter Count$_{i+1}[y]$ by one unit. Finally, we set Level$[y] \leftarrow i+1$.
Overall, $O(1 + D_y(Z_i))$ units of computation are performed during this event.

Next, we lower bound the net decrease in the potential $\mathcal{B}$ due to this event. We first discuss the node
potentials.

• (a) Since the node $y$ gets promoted to level $i+1$, we must have $D_y(Z_i) > \alpha d$, which implies
that $\max(0, \alpha d - D_y(Z_i)) = 0$, so that the potential $\Phi(y)$ does not change.

• (b) The potential of a node $x \in \mathcal{N}_y$ can only decrease.

• (c) The potential of a node $x \notin \{y\} \cup \mathcal{N}_y$ does not change.

Accordingly, the sum $\sum_{v \in V} \Phi(v)$ does not increase.

Next we consider the edge potentials. Towards this end, we first consider the edges incident upon
$y$. Specifically, consider a node $x \in \mathcal{N}_y$.

• If $\ell(x) < i$, the potential of the edge $(x, y)$ does not change.

• If $\ell(x) = i$, the potential of $(x, y)$ is $2(L - i) + 1$ before the level change and $2(L - i)$ afterwards,
i.e., the potential drops by one.

• If $\ell(x) = i + 1$, the potential of $(x, y)$ is $2(L - i)$ before the level change and $2(L - (i+1)) + 1 =
2(L - i) - 1$ afterwards, i.e., it drops by one.

• If $\ell(x) > i + 1$, the potential of $(x, y)$ is $2(L - i)$ before the level change and $2(L - (i+1))$
afterwards, i.e., it drops by two.

The potentials associated with all other edges remain unchanged. Thus, the sum $\sum_{e \in E} \Psi(e)$ drops
by at least $D_y(Z_i)$.

We infer that the net decrease in the overall potential $\mathcal{B}$ is at least $D_y(Z_i)$. Note that $D_y(Z_i) > 0$
(for otherwise the node $y$ would not have been promoted to level $i+1$). It follows that the net
decrease in $\mathcal{B}$ is sufficient to pay for the cost of the computation performed, which, as shown above,
is $O(1 + D_y(Z_i))$.

Event 2: A dirty node $y$ changes its level from $i$ to $(i-1)$.

First, we upper bound the amount of computation performed during this event. Our algorithm
scans through the nodes in the list Friends$_i[y]$. For each such node $x \in \mathcal{N}_y \cap Z_i$, we need to (a)
remove $y$ from the list Friends$_i[x]$ and add $y$ to the list Friends$_{i-1}[x]$ and (b) decrement the
counter Count$_i[x]$. Next, we need to add all the nodes in Friends$_i[y]$ to the list Friends$_{i-1}[y]$,
make Friends$_i[y]$ into an empty list, and set Count$_i[y]$ to zero. Finally, we set Level$[y] \leftarrow i-1$.
Overall, $O(1 + D_y(Z_i))$ units of computation are performed during this event.

Next, we lower bound the net decrease in the overall potential $\mathcal{B}$ due to this event. We first consider
the changes in the node potentials.

• (a) Since the node $y$ was demoted to level $i-1$, we must have $D_y(Z_{i-1}) < d$. Accordingly,
the potential $\Phi(y)$ drops by at least $(\alpha - 1) \cdot (d/\epsilon)$ units due to the decrease in $\ell(y)$.

• (b) For every neighbor $x$ of $y$, $D_x(Z_i)$ decreases by one while $D_x(Z_j)$ for $j \neq i$ is unchanged.
The potential function of a node $x$ considers only the $D_x(Z_j)$ values with $j < \ell(x)$. Thus, only
for neighbors $x$ with $\ell(x) > i$ does the potential function change; specifically, it increases by
at most $1/\epsilon$. Thus the sum $\sum_{x \in \mathcal{N}_y} \Phi(x)$ increases by at most $D_y(Z_{i+1})/\epsilon$. Further, note that
$D_y(Z_{i+1})/\epsilon \leq D_y(Z_{i-1})/\epsilon < d/\epsilon$. The last inequality holds since the node $y$ was demoted
from level $i$ to level $(i-1)$.

• (c) The potential $\Phi(x)$ remains unchanged for every node $x \notin \{y\} \cup \mathcal{N}_y$.

Thus, the sum $\sum_{v \in V} \Phi(v)$ drops by at least $(\alpha - 1) \cdot (d/\epsilon) - (d/\epsilon) = (\alpha - 2) \cdot (d/\epsilon)$.

We next consider edge potentials. Towards this end, we first consider the edges incident upon $y$.
Specifically, consider any node $x \in \mathcal{N}_y$.

• If $\ell(x) < i - 1$, the potential of the edge $(x, y)$ does not change.

• If $\ell(x) = i - 1$, the potential of $(x, y)$ is $2(L - (i-1))$ before the level change and $2(L - (i-1)) + 1$
afterwards, i.e., the potential increases by one.

• If $\ell(x) = i$, the potential of $(x, y)$ is $2(L - i) + 1$ before the level change and $2(L - (i-1)) =
2(L - i) + 2$ afterwards, i.e., it increases by one.

• If $\ell(x) \geq i + 1$, the potential of $(x, y)$ is $2(L - i)$ before the level change and $2(L - (i-1))$
afterwards, i.e., it increases by two.

The potentials associated with all other edges remain unchanged. Thus, the sum $\sum_{e \in E} \Psi(e)$ increases
by at most $2D_y(Z_{i-1}) < 2d$.

We infer that the overall potential $\mathcal{B}$ drops by at least $(\alpha - 2) \cdot (d/\epsilon) - 2d = (\alpha - 2 - 2\epsilon) \cdot (d/\epsilon)$.
Accordingly, for $\alpha \geq 2 + 3\epsilon$ this potential drop is at least $d \geq D_y(Z_i) + 1$. We conclude that the net
drop in the overall potential $\mathcal{B}$ is again sufficient to pay for the cost of the computation performed.
This concludes the proof of Theorem 4.2.
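For reference, the bookkeeping of Event 2 can be collected in one line. The display below only restates the three bounds derived above (the drop in $\Phi(y)$, the rise in the neighbors' node potentials, and the rise in the edge potentials), writing $\Delta\mathcal{B}$ for the net decrease of the overall potential; the symbol $\Delta\mathcal{B}$ is introduced here for convenience.

```latex
\begin{align*}
\Delta\mathcal{B}
  &\geq \underbrace{(\alpha-1)\frac{d}{\epsilon}}_{\text{drop in } \Phi(y)}
   \;-\; \underbrace{\frac{d}{\epsilon}}_{\text{rise in } \sum_{x \in \mathcal{N}_y} \Phi(x)}
   \;-\; \underbrace{2d}_{\text{rise in } \sum_{e \in E} \Psi(e)}
   \;=\; (\alpha - 2 - 2\epsilon)\,\frac{d}{\epsilon} \\
  &\geq \bigl((2 + 3\epsilon) - 2 - 2\epsilon\bigr)\,\frac{d}{\epsilon}
   \;=\; d
   \qquad \text{whenever } \alpha \geq 2 + 3\epsilon .
\end{align*}
```

This is exactly where the requirement $\alpha \geq 2 + 3\epsilon$ in Theorem 4.2 comes from: one $\epsilon$ absorbs the neighbors' node potentials being scaled by $1/\epsilon$, and the remaining $2\epsilon$ pays for the edge potentials.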

5 A Semi-Streaming Algorithm with Fast Update and Query times 

In Section 3, we presented a semi-streaming algorithm that maintains a $(2+\epsilon)$-approximation of
$\rho^*(G)$. Specifically, the algorithm can process a dynamic stream of updates (edge insertions/deletions)
using only $\tilde{O}(n)$ bits of space. Unfortunately, however, it has a large update time of $\tilde{O}(n)$, and it
answers a query only at the end of the stream (also in time $\tilde{O}(n)$).

On the other hand, in Section 4 we presented an algorithm that maintains a $(4+\epsilon)$-approximation
of $\rho^*(G)$. This algorithm has the advantage of having very fast (i.e., $\tilde{O}(1)$) update and query times.
Furthermore, it can answer a query at any given time-instant (even in the middle of the stream).
But, unlike the algorithm from Section 3 whose space complexity is $\tilde{O}(n)$, it has to store all the
edges and requires $\tilde{O}(m + n)$ bits of space.

In this section, we combine the techniques from Sections 3 and 4 to get a result that captures the
best of both worlds. Specifically, we present a new algorithm that maintains a $(4+\epsilon)$-approximation
of $\rho^*(G)$ while processing a stream of updates (edge insertions/deletions). We highlight that:

• The algorithm has very fast (i.e., $\tilde{O}(1)$) update and query times.

• It requires very little (i.e., $\tilde{O}(n)$) space.

• It can answer a query at any time-instant (i.e., even in the middle of the stream).

5.1 An Overview of Our Result 

We denote the input graph by $G = (V, E)$. It has $|V| = n$ nodes, and in the beginning of our
algorithm the graph is empty (i.e., $E = \emptyset$). Subsequently, our algorithm processes a “stream of
updates” in the graph. Each update consists of an edge insertion/deletion. Specifically, at each
“time-step”, either an edge is inserted into the graph or an already existing edge is deleted from the
graph. The node set of the graph, however, remains unchanged over time. For any integer $t \geq 0$,
we let $G^{(t)} = (V, E^{(t)})$ denote the status of the input graph at time-step $t$ (i.e., after the $t$-th edge
insertion/deletion). Thus, $G^{(0)} = (V, E^{(0)})$ denotes the status of $G$ in the beginning, which implies
that $E^{(0)} = \emptyset$. Further, we let $m^{(t)} = |E^{(t)}|$ denote the number of edges in the graph $G^{(t)}$. Finally,
$\text{Opt}^{(t)} = \rho^*(G^{(t)})$ gives the value of the densest subgraph in $G^{(t)}$. We also use the notations and
concepts introduced in Section 2. Our algorithm will maintain a value $\text{Output}^{(t)}$ at each time-step
$t$. We want $\text{Output}^{(t)}$ to be a $(4+\epsilon)$-approximation to $\text{Opt}^{(t)}$.

Throughout Section 5, we fix the symbols $\epsilon$, $\alpha$, $L$, $c$, $\lambda$ and $T$ as defined below.

$\epsilon \in (0,1)$ has a sufficiently small positive value. (4)

$\alpha = 2 + \Theta(\epsilon) = 2 + c^* \cdot \epsilon$, where $c^*$ is some constant independent of $\epsilon$ (to be decided later). (5)

$L = 2 + \lceil \log_{(1+\epsilon)} n \rceil$ (6)

$\lambda > 1$ is any positive constant, and $c$ is a constant such that $c \gg \lambda$. (7)

$T = \lceil n^{\lambda} \rceil$ (8)

We use the symbols 0(.) and 0(.) to hide poly(logn, 1/e) factors. We now state the main result. 

Theorem 5.1. Define T as in equation 8. There is an algorithm that processes a stream of T 
updates (starting from an empty graph), and satisfies the following properties with high probability: 

• It uses only 0(n ) bits of space. 

• The total time taken to process the T edge insertions/deletions is 0(T). Thus, it has an 
amortized update time of 0(1). 

• It maintains a value Output^ such that for all time-steps t € [1,T] we have Output^ < 
Op T M < (4 + e) ■ Opt^. Thus, the algorithm maintains a (4 + e)- approximation to the value 
of the densest subgraph while processing the stream of updates, and the query time is 0(1). 

Note that the algorithm in Theorem 5.1 works only for polynomially many time-steps (since 
T = 0(n A ) and A is a constant). In contrast, we did not impose this restriction while presenting 
our semi-streaming algorithm in Section 3. To see why this is the case, recall that a semi-streaming 
algorithm maintains a “sketch” of the input while processing the stream of edge insertions/deletions. 
For our algorithm in Section 3, the sketch is simply the collection of random samples from the edge 
set of the input graph. Let Sketch^ denote the status of the sketch at time-step t (i.e., it 
corresponds to the graph G'' > ). Now, the following condition holds in Section 3: 

• (P1) Fix any time-step $t$. With high probability, if we run the procedure in Section 3 on
$\text{Sketch}^{(t)}$, then this gives us a good approximation to the value of $\text{Opt}^{(t)}$.

A semi-streaming algorithm typically needs to invoke property (P1) only at the end of the stream,
since it answers a query after processing all the edge insertions/deletions. In this section, however,
we want an algorithm that can answer a query at any given time-instant (i.e., even in the middle
of the stream). Thus, we want the stronger property stated below.

• (P2) The following event holds with high probability. For every time-step $t \in [1,T]$, we can
get a good approximation to $\text{Opt}^{(t)}$ using $\text{Sketch}^{(t)}$.

Intuitively, (P2) follows if we take a union bound over the complement of (P1) for all time-steps
$t \in [1,T]$. But this can be done only if the length of the interval $[1,T]$ is bounded by some
polynomial in $n$. Thus, we require that $T = O(n^{\lambda})$ for some constant $\lambda$.
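The union bound can be written out explicitly. As an illustrative assumption (the failure exponent $c$ below is a placeholder, not a constant fixed by the text), suppose that (P1) fails at any fixed time-step with probability at most $n^{-c}$ and that $T \le n^{\lambda}$. Then:

```latex
\Pr[\text{(P2) fails}]
  \;\le\; \sum_{t=1}^{T} \Pr[\text{(P1) fails at time-step } t]
  \;\le\; T \cdot n^{-c}
  \;\le\; n^{\lambda - c}.
```

Under these assumptions the bound stays inverse-polynomially small whenever $c > \lambda$, which is why a polynomial bound on $T$ suffices.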

5.1.1 Main technical challenges 

At a very high level, the following approach seems natural for proving Theorem 5.1. First, using
the techniques from Section 3, maintain $\tilde{O}(n)$ uniformly random samples from the edge set of the
input graph. Next, using the techniques from Section 4, maintain $(\alpha, d, L)$-decompositions on these
randomly sampled edges. The first step should ensure that the algorithm requires only $\tilde{O}(n)$ bits
of space, while the second step should ensure that the algorithm requires $\tilde{O}(1)$ update and query
times. Unfortunately, however, to implement this simple idea we need to overcome several intricate
technical challenges. They are described below.

1. As stated at the end of Section 3, maintaining the random samples from the edge set $E$
requires $\tilde{O}(n)$ update time, since to process the insertion/deletion of an edge we have to
update $\tilde{O}(n)$ many $\ell_0$-samplers. So the first challenge is to speed up the update time of the
subroutine that maintains the randomly sampled edges. This is done in Section 5.2.

2. The algorithm in Section 3 uses one crucial observation that is captured in Lemma 2.3.
Specifically, the maximum density of a subgraph of $G = (V,E)$ lies in the range $[m/n, n]$,
where $m = |E|$ and $n = |V|$. Thus, we set $\pi = m/(2\alpha n)$ and $\sigma = 2(1+\epsilon)n$, so that we have
$\alpha \cdot \pi < d^* < \sigma/(2(1+\epsilon))$, where $d^* = \max_{S \subseteq V} \rho(S)$ is the value of the maximum density
of a subgraph of $G$ (see the discussion immediately after the proof of Lemma 3.2). Then
we discretize the range $[\pi, \sigma]$ in powers of $(1+\epsilon)$, by defining the values $d_k$, $k \in [K]$, as
per Corollary 2.7. For each $k \in [K]$, we construct an $(\alpha, d_k, L)$-decomposition. Finally, we
approximate the value of $d^*$ by looking at the topmost levels (i.e., the node set $V_L$) of each of
these decompositions. To implement this approach in Section 3, we wait till the end of the
stream to get the value of $m$ after all insertions/deletions. This is of crucial importance since
the degree-threshold $d_k$ (as per Corollary 2.7) for the $k$-th $(\alpha, d, L)$-decomposition depends on
the value of $\pi$, which, in turn, depends on $m$. In this section, however, we have to maintain
a solution at every time-step in the interval $[1,T]$. Consequently, we have to maintain an
$(\alpha, d_k, L)$-decomposition for each $k \in [K]$ throughout the interval $[1,T]$. Hence, the
degree-threshold $d_k$ of the $k$-th decomposition changes over time as edges are inserted into or
deleted from the input graph. Thus, we need to extend the dynamic algorithm from Section 4.1 (which
maintains an $(\alpha, d, L)$-decomposition for a fixed $d$) so that it can handle the scenario where
$d$ changes over time. See Section 5.5 for further details.

3. Suppose that we want to construct an $(\alpha, d, L)$-decomposition using $\tilde{O}(n)$ bits of space. In
Section 3, to achieve this goal we used a collection of sets $\{S_i\}$, $i \in \{1, \ldots, L-1\}$. Recall the
proof of Lemma 3.2 for details. Specifically, each $S_i$ consisted of $O(m \log n / d)$ many uniformly
random samples from the edge set $E$. Given the subset of nodes $Z_i$, we used the samples $S_i$
to construct the next subset $Z_{i+1} \subseteq Z_i$. Thus, we used different collections of sampled edges
for different levels of the $(\alpha, d, L)$-decomposition. The reason behind this was as follows: For
the proof of Lemma 3.2 to be valid, it was crucial that the samples used for defining the set
$Z_{i+1}$ be chosen independently of the samples used for the sets $Z_1, \ldots, Z_i$. This is in sharp
contrast to our dynamic algorithm in Section 4.1 for maintaining an $(\alpha, d, L)$-decomposition:
that algorithm uses the same edge set $E$ for different levels of the decomposition. Thus,
we need to find a way to extend the potential function based analysis from Section 4.1 to a
setting where different levels of the $(\alpha, d, L)$-decomposition are concerned with different sets
of edges (chosen uniformly at random). See Section 5.5 for further details.

Roadmap for the rest of Section 5. The rest of this section is organized as follows. 

• In Section 5.2, we show how to maintain random samples from the edge set of the input
graph in $\tilde{O}(n)$ space and $\tilde{O}(1)$ update time. For technical reasons, we have to run two
separate algorithms for this purpose. Roughly speaking, the first algorithm (as stated in
Theorem 5.2) maintains the entire edge set of the graph whenever $m^{(t)} = \tilde{O}(n)$, whereas the
second algorithm (as stated in Theorem 5.3) maintains $\tilde{O}(n)$ random samples from the edge
set of the graph whenever $m^{(t)} = \tilde{\Omega}(n)$.

• In Section 5.3, we present a high level overview of our main algorithm. The main idea is to
classify each time-step as either “dense” or “sparse”, depending on the number of edges in
the input graph. This classification is done in such a way that Theorem 5.2 applies to all
sparse time-steps, whereas Theorem 5.3 applies to all dense time-steps.

• In Section 5.4, we present our algorithm for maintaining a $(4+\epsilon)$-approximation to $\text{Opt}^{(t)}$
during all the sparse time-steps (see Theorem 5.4). This algorithm takes as input the set of
edges maintained by the subroutine from Theorem 5.2.

• In Section 5.5, we present our algorithm for maintaining a $(4+\epsilon)$-approximation to $\text{Opt}^{(t)}$
during all the dense time-steps (see Theorem 5.5). This algorithm takes as input the set of
edges maintained by the subroutine from Theorem 5.3.

• Theorem 5.1 follows from Theorem 5.4 and Theorem 5.5.

5.2 Maintaining the randomly sampled edges in $\tilde{O}(1)$ update time

Intuitively, we want to maintain $s$ uniformly random samples from the edge-set $E$ of the input
graph $G = (V,E)$, for some $s = \tilde{O}(n)$. In Section 3, we achieved this by running $\tilde{O}(n)$ mutually
independent copies of the $\ell_0$-sampler. This ensured a space complexity of $\tilde{O}(n)$. However, after
each edge insertion/deletion in the input graph, we had to update each of the $\ell_0$-samplers. So the
update time of our algorithm became $\tilde{O}(n)$. In this section, we show how to bring down this update
time (for maintaining the randomly sampled edges) to $\tilde{O}(1)$ without compromising on the space
complexity.

To see the high level idea behind our approach, suppose that the input graph has a large number
of edges, i.e., $s \ll m = |E|$. We maintain $s$ “buckets” $B_1, \ldots, B_s$. Whenever an edge $e$ is inserted
into the input graph, we insert the edge into a bucket chosen uniformly at random. When the edge
gets deleted from the input graph, we also delete it from the bucket it was assigned to. Thus, the
buckets $B_1, \ldots, B_s$ give a random partition of the edge set of the input graph $G = (V, E)$. We can
maintain this partition using an appropriate hash function. Now, we run $s$ mutually independent
copies of the $\ell_0$-sampler, one for each bucket $B_i$, $i \in \{1, \ldots, s\}$. Let $S$ be the collection of edges
returned by these $\ell_0$-samplers. Thus, we have $S \subseteq E$, $|S| = s$, and any given edge $e \in E$ belongs to
$S$ with a probability that is very close to $s/m$. In other words, the set $S$ is a set of $s$ edges chosen
uniformly at random from the edge set $E$ (without replacement). This solves our problem. The
space requirement is $\tilde{O}(s) = \tilde{O}(n)$ since we run $s$ copies of the $\ell_0$-sampler and each of these samplers
needs $\tilde{O}(1)$ space. Furthermore, unlike our algorithm in Section 3, here if an edge insertion/deletion
takes place in the input graph $G$, then we only need to update a single $\ell_0$-sampler (the one running
on the bucket that edge was assigned to). This improves the update time to $\tilde{O}(1)$.
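The bucketing idea can be illustrated with a toy sketch. It is not the space-efficient construction: it stores every bucket explicitly and redraws the bucket's sample on each update, whereas the text relies on $\ell_0$-samplers and a hash function with limited independence. The class name `BucketSampler` and all of its members are hypothetical names chosen for this illustration.

```python
import random


class BucketSampler:
    """Toy stand-in for s bucketed samplers (assumption: buckets stored explicitly)."""

    def __init__(self, s, seed=0):
        self.s = s
        self.rng = random.Random(seed)
        self.bucket_of = {}                    # edge -> bucket index (plays the role of the hash function)
        self.buckets = [[] for _ in range(s)]  # explicit contents; a real l0-sampler would not keep these
        self.sample = [None] * s               # current sample of each bucket

    def _resample(self, j):
        # Redraw a uniformly random edge from bucket j (None if the bucket is empty).
        bucket = self.buckets[j]
        self.sample[j] = self.rng.choice(bucket) if bucket else None

    def insert(self, edge):
        # Assumes a simple graph: the edge is not already present.
        j = self.rng.randrange(self.s)
        self.bucket_of[edge] = j
        self.buckets[j].append(edge)
        self._resample(j)                      # only bucket j is touched
        return j

    def delete(self, edge):
        j = self.bucket_of.pop(edge)
        self.buckets[j].remove(edge)
        self._resample(j)                      # only bucket j is touched
        return j

    def samples(self):
        # The sampled edge set S: one edge per non-empty bucket.
        return {e for e in self.sample if e is not None}


bs = BucketSampler(s=4, seed=1)
edges = [(1, 2), (2, 3), (3, 4), (1, 4), (1, 3)]
for e in edges:
    bs.insert(e)
bs.delete((2, 3))
```

Each update touches a single bucket, which is the property that brings the update time down in the argument above.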

To be more specific, we present two results in this section. In Theorem 5.2, we show how to
maintain the edge set of the input graph $G = (V,E)$ at all time-steps where it is “sparse” (i.e.,
$m = |E|$ is small). Next, in Theorem 5.3, we show how to maintain $\tilde{O}(n)$ uniformly random samples
(without replacement) from the edge set $E$ at all time-steps where the input graph is “dense”
(i.e., $m = |E|$ is large). The proofs of Theorems 5.2 and 5.3 appear in Sections 5.2.1 and 5.2.2
respectively. Both proofs crucially use well-known results on $\ell_0$-sampling in a streaming setting
(see Lemma 2.8) and $w$-wise independent hash functions (see Lemma 2.9).

Theorem 5.2. Starting from an empty graph, we can process the first $T$ updates in a dynamic
stream so as to maintain a random subset of edges $F^{(t)} \subseteq E^{(t)}$ at each time-step $t \le T$. This
requires $\tilde{O}(n)$ space and $\tilde{O}(1)$ worst case update time. Furthermore, the following condition holds
with high probability.

$F^{(t)} = E^{(t)}$ at each time-step $t \le T$ with $m^{(t)} \le 8\alpha c^2 n \log^2 n$. (9)

Note that the proof of Theorem 5.2 is by no means obvious, for the following reason. It might 
happen that we have not included an edge e from the input graph in our sample (since the graph 
currently contains many edges). However, with the passage of time, a lot of edges get deleted from 
the graph so as to make it sparse, and at that time we might need to recover the edge e. 

Theorem 5.3. Fix any positive integer $s \le 2\alpha c n \log n$. Starting from an empty graph, we can
process the first $T$ updates in a dynamic stream so as to maintain a random subset of edges
$S^{(t)} \subseteq E^{(t)}$ at each time-step $t \in [T]$. This requires $\tilde{O}(n)$ space and $\tilde{O}(1)$ worst case update time. For
every edge $e \in E^{(t)}$, let $X_e^{(t)} \in \{0,1\}$ be an indicator random variable that is set to one iff $e \in S^{(t)}$.
Then we have:

1. The following condition holds with high probability.

$\mathbb{E}[X_e^{(t)}] \in [(1 \pm \epsilon) \cdot s / m^{(t)}]$ for all edges $e \in E^{(t)}$, at each $t \le T$ with $m^{(t)} \ge 4\alpha c^2 n \log^2 n$. (10)

2. At each time-step $t \le T$, the random variables $\{X_e^{(t)}\}$, $e \in E^{(t)}$, are negatively associated.

3. Insertion/deletion of an edge in $G$ leads to at most two insertions/deletions in the set $S$.


5.2.1 Proof of Theorem 5.2 

The algorithm. We define $w = q = 8\alpha c^2 n \log^2 n$, and build a $w$-wise independent hash function
$h : E^* \to [w]$ as per Lemma 2.9. This requires $\tilde{O}(n)$ space, and the hash value $h(e)$ for any given
$e \in E^*$ can be evaluated in $O(1)$ time. For all $t \in [T]$ and $i \in [w]$, let $B_i^{(t)}$ denote the set of edges
$e \in E^{(t)}$ with $h(e) = i$. So the edge set $E^{(t)}$ is partitioned into $w$ random subsets $B_1^{(t)}, \ldots, B_w^{(t)}$.

As per Lemma 2.8, for each $i \in [w]$ we run $r = c^2 \log^2 n$ copies of a subroutine called
Streaming-Sampler. Specifically, for every $i \in [w]$ and $j \in [r]$, the subroutine
Streaming-Sampler$(i, j)$ maintains a uniformly random sample from the set $B_i^{(t)}$ in $\tilde{O}(1)$ space and $\tilde{O}(1)$ worst
case update time. Furthermore, the subroutines $\{$Streaming-Sampler$(i, j)\}$, $i \in [w]$, $j \in [r]$, use
mutually independent random bits. Let $Y^{(t)}$ denote the collection of the random samples
maintained by all these Streaming-Samplers. Since we have multiple Streaming-Samplers running
on the same set $B_i^{(t)}$, $i \in [w]$, a given edge can occur multiple times in $Y^{(t)}$. We define $F^{(t)} \subseteq E^{(t)}$ to
be the collection of those edges in $E^{(t)}$ that appear at least once in $Y^{(t)}$. Our algorithm maintains
the subset $F^{(t)}$ at each time-step $t \in [T]$.

Update time. Suppose that an edge $e$ is inserted into (resp. deleted from) the graph $G = (V, E)$
at time-step $t \in [T]$. To handle this edge insertion (resp. deletion), we first compute the value
$i = h(e)$ in constant time. Then we insert the edge into (resp. delete it from) the set $B_i^{(t)}$, and call the
subroutines Streaming-Sampler$(i, j)$ for all $j \in [r]$ so that they can accordingly update the
random samples maintained by them. Each Streaming-Sampler takes $\tilde{O}(1)$ time in the worst
case to handle an update. Since $r = O(\log^2 n)$, the total time taken by our algorithm to handle an
edge insertion/deletion is $O(r \cdot \mathrm{poly} \log n) = \tilde{O}(1)$.

Space complexity. We need $\tilde{O}(1)$ space to implement each Streaming-Sampler$(i, j)$, $i \in [w]$,
$j \in [r]$. Since $w = O(n \log^2 n)$ and $r = O(\log^2 n)$, the total space required by all the streaming
samplers is $O(wr \cdot \mathrm{poly} \log n) = \tilde{O}(n)$. Next, note that we can construct the hash function $h$ using
$\tilde{O}(n)$ space. These observations imply that the total space requirement of our scheme is $\tilde{O}(n)$.
Correctness. It remains to show that with high probability, at each time-step $t \in [T]$ with
$m^{(t)} \le 8\alpha c^2 n \log^2 n$, we have $F^{(t)} = E^{(t)}$.

Fix any time-step $t \in [T]$ with $m^{(t)} \le 8\alpha c^2 n \log^2 n$. Consider any $i \in [w]$. The probability
that any given edge $e \in E^{(t)}$ has $h(e) = i$ is equal to $1/w$. Since $w = 8\alpha c^2 n \log^2 n$, the linearity of
expectation implies that $\mathbb{E}[|B_i^{(t)}|] = m^{(t)}/w \le 1$. Since the hash function $h$ is $w$-wise independent,
and since $m^{(t)} \le w$, we can apply the Chernoff bound and infer that $|B_i^{(t)}| \le c \log n$ with high
probability. Now, a union bound over all $i \in [w]$ shows that with high probability, we have
$|B_i^{(t)}| \le c \log n$ for all $i \in [w]$. Let us call this event $\mathcal{E}^{(t)}$.

Condition on the event $\mathcal{E}^{(t)}$. Fix any edge $e \in E^{(t)}$. Let $h(e) = i$, for some $i \in [w]$. We
know that $e \in B_i^{(t)}$, that there are at most $c \log n$ edges in $B_i^{(t)}$, and that our algorithm runs
$r = c^2 \log^2 n$ many Streaming-Samplers on $B_i^{(t)}$. Each such Streaming-Sampler maintains
(independently of others) a uniformly random sample from $B_i^{(t)}$. Consider the event where the
edge $e$ is not picked in any of these random samples. This event occurs with probability at most
$(1 - 1/(c \log n))^{c^2 \log^2 n} \le 1/n^c$.

In other words, conditioned on the event $\mathcal{E}^{(t)}$, an edge $e \in E^{(t)}$ appears in $F^{(t)}$ with high
probability. Taking a union bound over all $e \in E^{(t)}$, we infer that $F^{(t)} = E^{(t)}$ with high probability,
conditioned on the event $\mathcal{E}^{(t)}$. Next, we recall that the event $\mathcal{E}^{(t)}$ itself occurs with high probability.
Thus, we get that the event $F^{(t)} = E^{(t)}$ also occurs with high probability. To conclude the proof,
we take a union bound over all time-steps $t \in [T]$ with $m^{(t)} \le 8\alpha c^2 n \log^2 n$.
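The bound $(1 - 1/(c \log n))^{c^2 \log^2 n} \le 1/n^c$ used in the correctness argument follows from $(1 - 1/x)^{x} \le e^{-1}$ with $x = c \log n$. A small numerical check is sketched below; it assumes natural logarithms, and the values of $n$ and $c$ are arbitrary illustrative choices rather than the constants of the paper.

```python
import math


def miss_probability_bound(n, c):
    """Upper bound (1 - 1/(c log n))^(c^2 log^2 n) on the chance that a fixed edge is never sampled."""
    x = c * math.log(n)   # bound on the bucket size, c log n (natural log assumed)
    r = x * x             # number of samplers per bucket, c^2 log^2 n
    return (1.0 - 1.0 / x) ** r


# Illustrative parameter choices only.
for n in (10, 100, 10_000):
    for c in (2, 3):
        assert miss_probability_bound(n, c) <= n ** (-c)
```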


5.2.2 Proof of Theorem 5.3 

We define $w = 2cs \log n$, $q = s$, and build a $w$-wise independent hash function $h : E^* \to [s]$ as
per Lemma 2.9. This requires $\tilde{O}(n)$ space, and the hash value $h(e)$ for any given $e \in E^*$ can be
evaluated in $O(1)$ time.

This hash function partitions the edge set $E^{(t)}$ into $s$ mutually disjoint buckets $\{B_j^{(t)}\}$, $j \in [s]$,
where the bucket $B_j^{(t)}$ consists of those edges $e \in E^{(t)}$ with $h(e) = j$. For each $j \in [s]$, we run an
independent copy of $\ell_0$-Sampler, as per Lemma 2.8, that maintains a uniformly random sample
from $B_j^{(t)}$. The set $S^{(t)}$ consists of the collection of outputs of all these $\ell_0$-Samplers. Note that
(a) for each $e \in E^*$, the hash value $h(e)$ can be evaluated in constant time [37], (b) an edge
insertion/deletion affects exactly one of the buckets, and (c) the $\ell_0$-Sampler of the affected bucket
can be updated in $\tilde{O}(1)$ time. Thus, we infer that this procedure handles an edge insertion/deletion
in the input graph in $\tilde{O}(1)$ time, and furthermore, since $s = \tilde{O}(n)$, the procedure can be implemented
in $\tilde{O}(n)$ space. We now show that this algorithm satisfies the three properties stated in Theorem 5.3.

1. Fix any time-step $t \in [1,T]$ where $m^{(t)} \ge 4\alpha c^2 n \log^2 n$. Since $s \le 2\alpha c n \log n$, we infer that
$m^{(t)} = |E^{(t)}| \ge 2cs \log n$. Hence, we can partition (purely as a thought experiment) the
edges in $E^{(t)}$ into at most polynomially many groups $\{H_{j'}^{(t)}\}$, in such a way that the size
of each group lies between $cs \log n$ and $2cs \log n$. Thus, for any $j \in [s]$ and any $j'$, we have
$|H_{j'}^{(t)} \cap B_j^{(t)}| \in [c \log n, 2c \log n]$ in expectation. Since the hash function $h$ is $(2cs \log n)$-wise
independent, by applying a Chernoff bound we infer that with high probability, the value
$|H_{j'}^{(t)} \cap B_j^{(t)}|$ is within a $(1 \pm \epsilon)$ factor of its expectation. Applying the union bound over all
$j, j'$, we infer that with high probability, the sizes of all the sets $H_{j'}^{(t)} \cap B_j^{(t)}$ are within a
$(1 \pm \epsilon)$ factor of their expected values; let us call this event $\mathcal{R}^{(t)}$. Note that $\mathbb{E}[|B_j^{(t)}|] = m^{(t)}/s$
and $|B_j^{(t)}| = \sum_{j'} |B_j^{(t)} \cap H_{j'}^{(t)}|$. Hence, under the event $\mathcal{R}^{(t)}$, for all $j \in [s]$ the quantity $|B_j^{(t)}|$
is within a $(1 \pm \epsilon)$ factor of $m^{(t)}/s$. Under the same event $\mathcal{R}^{(t)}$, due to the $\ell_0$-Samplers, the
probability that a given edge $e \in E^{(t)}$ with $h(e) = j$ (say) becomes part of $S^{(t)}$ is within a
$(1 \pm \epsilon)$ factor of $1/|B_j^{(t)}|$, which, in turn, is within a $(1 \pm \epsilon)$ factor of $s/m^{(t)}$. This implies that
for any given edge $e \in E^{(t)}$, we have $\mathbb{E}[X_e^{(t)}] = \Pr[e \in S^{(t)}] \in [(1 \pm \epsilon) \cdot s/m^{(t)}]$ with high
probability.


2. The property of negative association follows from the facts that (a) if two edges are hashed to
different buckets, then they are included in $S^{(t)}$ in a mutually independent manner, and (b)
if they are hashed to the same bucket, then they are never simultaneously included in $S^{(t)}$.

3. Finally, when an edge $e$ is inserted into (resp. deleted from) the input graph, only the
$\ell_0$-sampler running on the bucket $B_j$, for $j = h(e)$, gets affected. This implies that a single edge
insertion/deletion in the input graph leads to at most two edge insertions/deletions in the
random subset of edges $S \subseteq E$.


5.3 A high level overview of our algorithm: Sparse and Dense Intervals 

Our algorithm for Theorem 5.1 will consist of five different components. They are described below.

• (P1) This subroutine implements an algorithm as per Theorem 5.2.

• (P2) This subroutine implements $\tilde{O}(1)$ independent copies of the algorithm in Theorem 5.3.

• (P3) For each $t \in [1,T]$, this subroutine classifies time-step $t$ as either “dense” or “sparse”.
This is done on the fly, i.e., immediately after receiving the $t$-th edge insertion/deletion in
$G$. Consequently, the range $[1,T]$ is partitioned into “dense” and “sparse” “intervals”, where
a dense (resp. sparse) interval is a maximal and contiguous block of dense (resp. sparse)
time-steps. For example, we say that $[t_0, t_1] \subseteq [1,T]$ is a dense interval iff (a) time-step $t$
is dense for all $t \in [t_0, t_1]$, (b) either $t_0 = 1$ or time-step $(t_0 - 1)$ is sparse, and (c) either
$t_1 = T$ or time-step $(t_1 + 1)$ is sparse. The sparse time-intervals are defined analogously. The
subroutine ensures the following properties.

1. We have $m^{(t)} \le 8\alpha c^2 n \log^2 n$ for every sparse time-step $t \in [1,T]$. In other words, the
input graph has a small number of edges in a sparse time-step. Note that $8\alpha c^2 n \log^2 n$
is also the threshold used in Theorem 5.2. Thus, with high probability, equation 9 holds
at all sparse time-steps.

2. We have $m^{(t)} \ge 4\alpha c^2 n \log^2 n$ for every dense time-step $t \in [1,T]$. In other words, the
input graph has a large number of edges in a dense time-step. Note that $4\alpha c^2 n \log^2 n$ is
also the threshold used in Theorem 5.3 (part 1). Thus, with high probability, equation 10
holds at all dense time-steps.

3. If a dense interval begins at a time-step $t$, then we have $m^{(t)} = 1 + 8\alpha c^2 n \log^2 n$.

4. Every dense (resp. sparse) interval spans at least $4\alpha c^2 n \log^2 n$ time-steps, unless it is the
interval ending at $T$.

The classification of a time-step as dense or sparse is done according to the procedure outlined
in Figure 2. This procedure can be easily implemented in $O(1)$ space and $O(1)$ update time,
since all we need is a counter that keeps track of the number of edges in the input graph
while processing the stream of updates. Furthermore, it is easy to check that the procedure
in Figure 2 ensures all four properties described above.
• (P4) This subroutine takes as input the set of edges maintained by (P1). Furthermore, it also
has access to the output of the subroutine (P3). At all sparse time-steps, it maintains a value
$\text{Output}^{(t)}$ that gives a $(4+\epsilon)$-approximation of $\text{Opt}^{(t)}$. See Section 5.4 for further details.

• (P5) This subroutine takes as input the subsets of edges maintained by (P2). Furthermore,
it also has access to the output of the subroutine (P3). At all dense time-steps, it maintains
a value $\text{Output}^{(t)}$ that gives a $(4+\epsilon)$-approximation of $\text{Opt}^{(t)}$. See Section 5.5 for further
details.

• Theorem 5.1 follows from Theorem 5.4 and Theorem 5.5. 


The time-step 1 is classified as sparse.
For $t = 2$ to $T$
    If time-step $(t-1)$ was sparse, Then
        If $m^{(t)} \le 8\alpha c^2 n \log^2 n$, Then
            Classify time-step $t$ as sparse.
        Else if $m^{(t)} > 8\alpha c^2 n \log^2 n$, Then
            Classify time-step $t$ as dense.
    Else if time-step $(t-1)$ was dense, Then
        If $m^{(t)} \ge 4\alpha c^2 n \log^2 n$, Then
            Classify time-step $t$ as dense.
        Else if $m^{(t)} < 4\alpha c^2 n \log^2 n$, Then
            Classify time-step $t$ as sparse.


Figure 2: CLASSIFY-TIME-STEPS(.). 
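The procedure in Figure 2 is a two-threshold (hysteresis) rule on the edge count. A minimal Python sketch is given below; the function name `classify_time_steps` and the parameters `low` and `high` are hypothetical, with `high` standing in for $8\alpha c^2 n \log^2 n$ and `low` for $4\alpha c^2 n \log^2 n$.

```python
def classify_time_steps(edge_counts, low, high):
    """Label each time-step 'sparse' or 'dense' following the rule of Figure 2.

    edge_counts[t - 1] is the number of edges m^(t) after the t-th update.
    `high` and `low` are placeholders for the two thresholds in the text.
    """
    labels = []
    for t, m in enumerate(edge_counts, start=1):
        if t == 1:
            labels.append("sparse")                       # time-step 1 is always sparse
        elif labels[-1] == "sparse":
            labels.append("sparse" if m <= high else "dense")
        else:
            labels.append("dense" if m >= low else "sparse")
    return labels


# Illustrative toy thresholds: the graph grows to 5 edges, then shrinks again.
labels = classify_time_steps([1, 2, 3, 4, 5, 4, 3, 2, 1], low=2, high=4)
```

Because the switch to dense happens only above `high` and the switch back only below `low`, many updates must occur between consecutive switches, which is what property 4 of (P3) expresses.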


5.4 Algorithm for sparse intervals 

In this section, we show how to maintain a $(4+\epsilon)$-approximation to the value of the densest
subgraph during the sparse time-intervals. Specifically, we prove the following theorem.

Theorem 5.4. There is an algorithm that uses $\tilde{O}(n)$ space and maintains a value $\text{Output}^{(t)}$ at
every sparse time-step $t \le T$. The algorithm gives the following two guarantees with high probability.

• $\text{Opt}^{(t)}/(4+\epsilon) \le \text{Output}^{(t)} \le \text{Opt}^{(t)}$ at every sparse time-step $t \le T$.

• The algorithm takes $\tilde{O}(T)$ time to process the stream of $T$ updates in $G$. In other words, the
amortized update time is $\tilde{O}(1)$.

The algorithm for Theorem 5.4 consists of two major ingredients. 

• First, we run a subroutine as per Theorem 5.2 while processing the stream of $T$ updates.² This
ensures that with high probability, we maintain a subset $F^{(t)} \subseteq E^{(t)}$ such that $F^{(t)} = E^{(t)}$
at every sparse time-step $t \le T$. The worst case update time and space complexity of this
subroutine are $\tilde{O}(1)$ and $\tilde{O}(n)$ respectively.

• Second, we run our dynamic algorithm from Section 4, which we refer to as Dynamic-algo,
on the graph $(V, F^{(t)})$ during every sparse time-interval.³ Since $F^{(t)} = E^{(t)}$ throughout the
duration of such an interval (with high probability), this allows us to maintain an $\text{Output}^{(t)} \in
[\text{Opt}^{(t)}/(4+\epsilon), \text{Opt}^{(t)}]$ at every sparse time-step $t \le T$. As the input graph has $\tilde{O}(n)$ edges
at every sparse time-step, the space complexity of Dynamic-algo is $\tilde{O}(n)$.

² Note that this is the same subroutine (P1) from Section 5.3.


It remains to analyze the amortized update time of Dynamic-algo. Towards this end, fix any
sparse time-interval $[t_0, t_1]$, and let $C(t_0, t_1)$ denote the amount of computation performed during
this interval by the subroutine Dynamic-algo. Consider two possible cases.

• Case 1. ($t_1 < T$)

In this case, our analysis from Section 4 implies that $C(t_0, t_1) = \tilde{O}(n + (t_1 - t_0))$. Since
$t_1 < T$, the subroutine (P3) from Section 5.3 ensures that $(t_1 - t_0) = \Omega(n)$.⁴ This gives us
the guarantee that $C(t_0, t_1) = \tilde{O}(t_1 - t_0)$.

• Case 2. ($t_1 = T$)

In this case, the sparse time-interval under consideration ends at $T$. Thus, if $(t_1 - t_0) = o(n)$,
then we would have $C(t_0, t_1) = \tilde{O}(n)$. Else if $(t_1 - t_0) = \Omega(n)$, then applying a similar
argument as in Case 1, we get $C(t_0, t_1) = \tilde{O}(t_1 - t_0)$.

Let $[t_0^i, t_1^i]$ denote the $i$-th sparse time-interval, and let $C$ denote the amount of computation
performed by Dynamic-algo during the entire time-period $[1,T]$. Since the sparse time-intervals are
mutually disjoint, and since there can be at most one sparse time-interval that ends at $T$, we get
the following guarantee.

$C = \left( \sum_i \tilde{O}(t_1^i - t_0^i) \right) + \tilde{O}(n) \le \tilde{O}(T) + \tilde{O}(n) = \tilde{O}(T)$ (11)

The last equality holds as $T = \Theta(n^{\lambda})$ and $\lambda \ge 1$ (equation 7). This shows that the amortized
update time of the algorithm is $\tilde{O}(1)$, thereby concluding the proof of Theorem 5.4.


5.5 Algorithm for dense intervals 

In this section, we show how to maintain a $(4+\epsilon)$-approximation to the value of the densest
subgraph during the dense time-intervals. Specifically, we prove the following theorem.

Theorem 5.5. There is an algorithm that uses $\tilde{O}(n)$ space and maintains a value $\text{Output}^{(t)}$ at
every dense time-step $t \le T$. The algorithm gives the following two guarantees with high probability.

• $\text{Opt}^{(t)}/(4+\epsilon) \le \text{Output}^{(t)} \le \text{Opt}^{(t)}$ at every dense time-step $t \le T$.

• The algorithm takes $\tilde{O}(T)$ time to process the stream of $T$ updates in $G$. In other words, the
amortized update time is $\tilde{O}(1)$.

³ We can identify each sparse time-interval using the subroutine (P3) from Section 5.3.

⁴ See item 4 in the description of the subroutine (P3).
5.5.1 Basic building block for proving Theorem 5.5

For every time-step $t \in [1,T]$, we define:

$\pi^{(t)} = m^{(t)}/(2\alpha n)$, and $\sigma = 2(1+\epsilon)n$ (12)

Accordingly, Lemma 2.3 implies that:

$\alpha \cdot \pi^{(t)} < \text{Opt}^{(t)} < \sigma/(2(1+\epsilon))$ (13)

We now discretize the range $[\pi^{(t)}, \sigma]$ in powers of $(1+\epsilon)$ as in Corollary 2.7. Specifically, we define:

$K = 2 + \lceil \log_{(1+\epsilon)} (\sigma \cdot (2\alpha n)) \rceil$ (14)

$d_k^{(t)} = (1+\epsilon)^{k-1} \cdot \pi^{(t)}$ for all $k \in [K]$ (15)

Note that $m^{(t)} \ge 1$ at every dense time-step $t \le T$. This ensures that $\pi^{(t)} \ge 1/(2\alpha n)$, and hence
we have $K \ge 2 + \lceil \log_{(1+\epsilon)} (\sigma/\pi^{(t)}) \rceil$ during those time-steps. In other words, the values of $\pi^{(t)}$, $\sigma$
and $K$ satisfy the condition stated in Corollary 2.7 during the dense time-intervals. Also note that
the value of $K$ does not depend on the time-step $t$ under consideration.⁵ Furthermore, we have
$K = \tilde{O}(1)$.

We want to maintain a $2\alpha(1+\epsilon)^3 = (4 + O(\epsilon))$-approximation of the maximum density during
the dense time-intervals. For this purpose it suffices to maintain an $(\alpha, d_k^{(t)}, L)$-decomposition of
$G^{(t)}$ for each $k \in [K]$ (see Corollary 2.7).

Thus, we conclude that Theorem 5.5 follows from Theorem 5.6. Accordingly, for the rest of
Section 5.5, we fix an index $k \in [K]$, and focus on proving Theorem 5.6.

Theorem 5.6. Fix any integer $k \in [K]$. There is a dynamic algorithm that uses $\tilde{O}(n)$ bits of space
and maintains $L$ subsets of nodes $V = Z_1^{(t)} \supseteq \cdots \supseteq Z_L^{(t)}$ at every dense time-step $t \in [T]$. The
algorithm is randomized and gives the following two guarantees with high probability.

• The tuple $(Z_1^{(t)}, \ldots, Z_L^{(t)})$ is an $(\alpha, d_k^{(t)}, L)$-decomposition of $G^{(t)}$ at every dense time-step $t \in [T]$.

• The algorithm takes $\tilde{O}(T)$ time to process the stream of $T$ updates. In other words, the
algorithm has an amortized update time of $\tilde{O}(1)$.

5.5.2 Overview of our algorithm for Theorem 5.6 

Our algorithm for proving Theorem 5.6 consists of two major ingredients. 

• First, at each dense time-step $t \le T$ we maintain a collection of $(L-1)$ random subsets
of edges $S_1^{(t)}, \ldots, S_{L-1}^{(t)} \subseteq E^{(t)}$. To maintain these random subsets, we need to run $(L-1)$
mutually independent copies of the algorithm in Theorem 5.3 (for an appropriate value of $s$).

• Second, using the random subsets $S_i^{(t)} \subseteq E^{(t)}$, $i \in [L-1]$, we maintain an $(\alpha, d_k^{(t)}, L)$-decomposition
$(Z_1^{(t)}, \ldots, Z_L^{(t)})$ during the dense time-steps. Specifically, we set $Z_1^{(t)} = V$. Next, for
$i = 1$ to $(L-1)$, we construct the node set $Z_{i+1}^{(t)} \subseteq Z_i^{(t)}$ by looking at the degrees $D_v(Z_i^{(t)}, S_i^{(t)})$
of the nodes $v \in Z_i^{(t)}$ among the edges $e \in S_i^{(t)}$. This follows the spirit of the construction in
Section 3 (but note that there we did not concern ourselves with the update time).

⁵ This is especially important since we want to maintain an $(\alpha, d_k^{(t)}, L)$-decomposition for each $k \in [K]$ during the
dense time-intervals. If $K$ were a function of $t$, then the number of such decompositions would have varied over time.
From the proof of Lemma 3.2, we infer that each $S_i^{(t)}$ should contain $s^{(t)} = c\, m^{(t)} \log n / d_k^{(t)}$
random samples from the edge set $E^{(t)}$. Equations 12 and 15 ensure that $s^{(t)} = 2\alpha c n \log n / (1+\epsilon)^{k-1}$,
which means that the value of $s^{(t)}$ is independent of the time-step $t$ under consideration.
Accordingly, for the rest of Section 5.5, we omit the superscript $t$ from $s^{(t)}$ and define:

$s = 2\alpha c n \log n / (1+\epsilon)^{k-1} = c\, m^{(t)} \log n / d_k^{(t)}$ for each dense time-step $t \le T$. (16)

Furthermore, since $k \ge 1$, we observe that:

$s \le 2\alpha c n \log n$ (17)

Thus, the value assigned to $s$ satisfies the condition dictated by Theorem 5.3.
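The cancellation of $m^{(t)}$ in equation 16 can be checked numerically. The sketch below evaluates equations 12, 14, 15 and 16 directly; the function names and the parameter values ($n$, $\alpha$, $\epsilon$, $c$, $k$) are illustrative assumptions, not the constants fixed elsewhere in the paper.

```python
import math


def thresholds(m, n, alpha, eps, c, k):
    """Return (pi, d_k, s) following equations 12, 15 and 16 for edge count m."""
    pi = m / (2 * alpha * n)            # equation 12
    d_k = (1 + eps) ** (k - 1) * pi     # equation 15
    s = c * m * math.log(n) / d_k       # s = c * m * log n / d_k
    return pi, d_k, s


def num_thresholds(n, alpha, eps):
    """K from equation 14, with sigma = 2(1 + eps) n; note that m does not appear."""
    sigma = 2 * (1 + eps) * n
    return 2 + math.ceil(math.log(sigma * 2 * alpha * n, 1 + eps))


# Illustrative values only.
n, alpha, eps, c, k = 100, 2.1, 0.1, 3, 4
closed_form = 2 * alpha * c * n * math.log(n) / (1 + eps) ** (k - 1)
for m in (1_000, 5_000, 1_000_000):
    _, _, s = thresholds(m, n, alpha, eps, c, k)
    assert math.isclose(s, closed_form, rel_tol=1e-9)    # s does not depend on m
    assert s <= 2 * alpha * c * n * math.log(n)          # equation 17
```

The degree-threshold `d_k` scales with `m`, while the sample size `s` stays fixed, which is what allows a single copy of the algorithm in Theorem 5.3 to serve the whole dense interval.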

To summarize, our first subroutine maintains $(L-1)$ mutually independent copies of the algorithm
in Theorem 5.3 (for the value of $s$ as defined by equations 16, 17). The random subsets of edges
maintained by them are denoted as $S_1^{(t)}, \ldots, S_{L-1}^{(t)}$. Since $L = \tilde{O}(1)$ (see equation 6), Theorem 5.3
ensures that this subroutine has $\tilde{O}(1)$ worst case update time and $\tilde{O}(n)$ space complexity.

Next, we present another subroutine Dynamic-stream (see Section 5.5.3) that is invoked only
during the dense time-intervals.⁶ This subroutine can access the random subsets $S_i^{(t)}$, $i \in [L-1]$,
and it maintains $L$ subsets of nodes $V = Z_1^{(t)} \supseteq \cdots \supseteq Z_L^{(t)}$ at each dense time-step $t \in [T]$.

Roadmap. The rest of Section 5.5 is organized as follows.

• In Section 5.5.3, we fix a dense time-interval $[t_0, t_1]$, and describe how the subroutine
Dynamic-stream processes the edge insertions/deletions in the input graph during this interval. We
show that Dynamic-stream maintains an $(\alpha, d_k^{(t)}, L)$-decomposition of the input graph
throughout the duration of this dense time-interval (see Lemma 5.8).

• Section 5.5.4 presents the data structures for implementing the subroutine Dynamic-stream.

• In Section 5.5.5, we make some preliminary observations about the running time of the
subroutine Dynamic-stream, and analyze its space complexity (see Lemma 5.9).

• In Section 5.5.6, we analyze the amortized update time of the subroutine Dynamic-stream
(see Corollary 5.14). Theorem 5.6 follows from Lemma 5.8, Lemma 5.9 and Corollary 5.14.

• Sections 5.5.7 and 5.5.8 are devoted to the proofs of two lemmas stated in Section 5.5.6.

5.5.3 The subroutine Dynamic-stream for a dense time-interval $[t_0, t_1]$

Fix any dense time-interval $[t_0, t_1] \subseteq [1,T]$. We will maintain $L$ subsets of nodes
$V = Z_1^{(t)} \supseteq \cdots \supseteq Z_L^{(t)}$ at each time-step $t \in [t_0, t_1]$. We will prove that, with high probability,
the tuple $(Z_1^{(t)}, \ldots, Z_L^{(t)})$ remains a valid $(\alpha, d_k^{(t)}, L)$-decomposition of $G^{(t)} = (V, E^{(t)})$
throughout the interval.

We run $(L-1)$ mutually independent copies of the algorithm stated in Theorem 5.3 (for the
value of $s$ as defined by equations 16, 17). Hence, at each time-step $t \in [t_0, t_1]$, we can access the
mutually independent random subsets of edges $S_1^{(t)}, \ldots, S_{L-1}^{(t)} \subseteq E^{(t)}$ as defined in Section 5.5.2.

Initialization in the beginning of the dense time-interval $[t_0, t_1]$.

Just before time-step $t_0$, we perform the initialization step outlined in Figure 3. It ensures that
$Z_1^{(t_0 - 1)} = V$ and $Z_i^{(t_0 - 1)} = \emptyset$ for all $i \in \{2, \ldots, L\}$.

⁶ The dense time-intervals can be easily identified using the subroutine (P3) from Section 5.3.
Set $Z_1^{(t_0 - 1)} \leftarrow V$.
For $i = 2$ to $L$
    Set $Z_i^{(t_0 - 1)} \leftarrow \emptyset$.


Figure 3: INITIALIZE(.). 


Handling an update in $G$ during the dense time-interval $[t_0, t_1]$.

Consider an edge insertion/deletion in the input graph during time-step $t \in [t_0, t_1]$. The edge set
$E^{(t)}$ is different from the edge set $E^{(t-1)}$. Accordingly, for all $i \in [L-1]$, the subset of edges $S_i^{(t)}$
may differ from the subset of edges $S_i^{(t-1)}$. Therefore, at this time we call the subroutine
RECOVER-SAMPLE($t$) in Figure 4. Its input is the old decomposition $(Z_1^{(t-1)}, \ldots, Z_L^{(t-1)})$. Based
on this old decomposition, and the new samples $\{S_i^{(t)}\}$, $i \in [L-1]$, the subroutine constructs a
new decomposition $(Z_1^{(t)}, \ldots, Z_L^{(t)})$.

As in Section 3, we want to ensure that the node set $Z_i^{(t)}$ is completely determined by the outcomes
of the random samples in $\{S_j^{(t)}\}$, $j < i$ (see the proof of Lemma 3.2). Towards this end, we observe
the following lemma.


Set $Z_1^{(t)} \leftarrow V$.
For $i = 1$ to $L$
    Set $Y_i \leftarrow Z_i^{(t-1)}$.
For $i = 1$ to $(L-1)$
    Let $A_i^{(t)}$ be the set of nodes $y \in Z_i^{(t)}$ having $D_y(Z_i^{(t)}, S_i^{(t)}) \ge (1-\epsilon)^2 \alpha c \log n$.
    Let $B_i^{(t)}$ be the set of nodes $y \in Z_i^{(t)}$ having $D_y(Z_i^{(t)}, S_i^{(t)}) \le (1+\epsilon)^2 c \log n$.
    Set $Y_{i+1} \leftarrow Y_{i+1} \cup A_i^{(t)}$.
    For all $j = (i+1)$ to $(L-1)$
        Set $Y_j \leftarrow Y_j \setminus B_i^{(t)}$.
    Set $Z_{i+1}^{(t)} \leftarrow Y_{i+1}$.


Figure 4: RECOVER-SAMPLE($t$).
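A direct, unoptimized transcription of the procedure in Figure 4 may help in following the index bookkeeping. This is only a sketch: the names `recover_sample` and `degree`, the 0-based indexing, and the parameters `hi` and `lo` (standing in for $(1-\epsilon)^2 \alpha c \log n$ and $(1+\epsilon)^2 c \log n$) are assumptions made for illustration, and the code recomputes degrees from scratch instead of using the data structures of Section 5.5.4.

```python
def degree(y, Z, S):
    """D_y(Z, S): number of edges in S incident to y whose other endpoint lies in Z."""
    return sum(1 for (u, v) in S if (u == y and v in Z) or (v == y and u in Z))


def recover_sample(V, old_Z, samples, hi, lo):
    """Rebuild the node sets from the old decomposition and the new samples.

    old_Z   : list of L node sets (the decomposition at the previous time-step)
    samples : list of L - 1 edge sets (the new samples, one per level)
    hi, lo  : placeholders for the two degree thresholds used in Figure 4
    Lists are 0-indexed here, so Z[i] corresponds to the (i + 1)-th set in the text.
    """
    L = len(old_Z)
    Y = [set(z) for z in old_Z]                       # Y_i starts as the old Z_i
    Z = [set(V)] + [set() for _ in range(L - 1)]      # the first set is always V
    for i in range(L - 1):
        A = {y for y in Z[i] if degree(y, Z[i], samples[i]) >= hi}
        B = {y for y in Z[i] if degree(y, Z[i], samples[i]) <= lo}
        Y[i + 1] |= A                                 # high-degree nodes move up
        for j in range(i + 1, L - 1):                 # same index range as in the figure
            Y[j] -= B                                 # low-degree nodes are removed
        Z[i + 1] = set(Y[i + 1])
    return Z


# Toy instance with L = 3 levels; node 4 sits at level 2 in the old decomposition.
V = {1, 2, 3, 4}
samples = [{(1, 2), (1, 3), (2, 3), (3, 4)}, {(1, 2)}]
Z = recover_sample(V, [V, {4}, set()], samples, hi=2, lo=1)
```

In the toy instance, node 4 has sampled degree 1 at the first level and is therefore removed from the second set, while nodes 1, 2 and 3 are promoted.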


Lemma 5.7. Fix any time-step $t \in [t_0, t_1]$, any index $i \in [L-1]$, and consider the set $Z_i^{(t)}$ as
defined by the procedure in Figure 4.

1. The node set $Z_i^{(t)}$ is completely determined by the contents of the sets $\{S_j^{(t)}\}$, $j < i$.

2. The contents of the random sets $\{S_j^{(t)}\}$, $j \ge i$, are independent of the contents of the set $Z_i^{(t)}$.

Proof. Follows from the description of the procedure in Figure 4. □

We now prove the correctness of our algorithm. Specifically, we show that with high probability
the tuple $(Z_1^{(t)}, \ldots, Z_L^{(t)})$ remains a valid $(\alpha, d_k^{(t)}, L)$-decomposition of the input graph throughout
the dense time-interval under consideration.

Lemma 5.8. With high probability, at each time-step $t \in [t_0, t_1]$ the tuple $(Z_1^{(t)}, \ldots, Z_L^{(t)})$ maintained
by the subroutine Dynamic-stream gives an $(\alpha, d_k^{(t)}, L)$-decomposition of the input graph $G^{(t)}$.
Proof. For each time-step $t \in [t_0, t_1]$ and index $i \in [L-1]$, we define an event $\mathcal{E}_i^{(t)}$ as follows.

• The event $\mathcal{E}_i^{(t)}$ occurs iff the following two conditions hold simultaneously.

  - $Z_{i+1}^{(t)} \supseteq \{v \in Z_i^{(t)} : D_v(Z_i^{(t)}, E^{(t)}) > \alpha d^{(t)}\}$, and

  - $Z_{i+1}^{(t)} \cap \{v \in Z_i^{(t)} : D_v(Z_i^{(t)}, E^{(t)}) < d^{(t)}\} = \emptyset$.

We now define another event $\mathcal{E}^{(t)}$ for each time-step $t \in [t_0, t_1]$.

• $\mathcal{E}^{(t)} = \bigcap_{i=1}^{L-1} \mathcal{E}_i^{(t)}$.

By Definition 2.5, the tuple $(Z_1^{(t)}, \ldots, Z_L^{(t)})$ is an $(\alpha, d^{(t)}, L)$-decomposition of $G^{(t)}$ iff the event $\mathcal{E}^{(t)}$ occurs. Below, we show that $\Pr[\mathcal{E}_i^{(t)}] \geq 1 - 1/(\text{poly } n)$ for any given $i \in \{1, \ldots, L-1\}$ and $t \in [t_0, t_1]$. Taking a union bound over all $i \in \{1, \ldots, L-1\}$, we get $\Pr[\mathcal{E}^{(t)}] \geq 1 - 1/(\text{poly } n)$ at each time-step $t \in [t_0, t_1]$. Hence, the lemma follows by taking a union bound over all $t \in [t_0, t_1]$.

For the rest of the proof, fix any time-step $t \in [t_0, t_1]$ and index $i \in \{1, \ldots, L-1\}$.

• Fix any instance of the random set $Z_i^{(t)}$ and condition on this event.

  - By Theorem 5.3 and equation 16, each edge $e \in E^{(t)}$ appears in $S_i^{(t)}$ with probability $(1 \pm \varepsilon) s / m^{(t)} = (1 \pm \varepsilon) c \log n / d^{(t)}$. Furthermore, these events are negatively associated (see Section 2.4).

    Consider any node $v \in Z_i^{(t)}$ with $D_v(Z_i^{(t)}, E^{(t)}) > \alpha d^{(t)}$. By linearity of expectation:

    $E[D_v(Z_i^{(t)}, S_i^{(t)})] \geq \alpha d^{(t)} \cdot (1 - \varepsilon) c \log n / d^{(t)} = (1 - \varepsilon) \alpha c \log n$.

    Since the contents of the random set $S_i^{(t)}$ are independent of the contents of $Z_i^{(t)}$ (see Lemma 5.7), we can apply a Chernoff bound on this expectation, and derive that:

    $\Pr[D_v(Z_i^{(t)}, S_i^{(t)}) > (1 - \varepsilon)^2 \alpha c \log n \mid Z_i^{(t)}] \geq 1 - 1/(\text{poly } n)$    (18)

    Now, Figure 4 implies that if $D_v(Z_i^{(t)}, S_i^{(t)}) > (1 - \varepsilon)^2 \alpha c \log n$, then the node $v$ becomes part of $Z_{i+1}^{(t)}$. Thus, applying equation 18 we get:

    $\Pr[v \in Z_{i+1}^{(t)} \mid Z_i^{(t)}] \geq \Pr[D_v(Z_i^{(t)}, S_i^{(t)}) > (1 - \varepsilon)^2 \alpha c \log n \mid Z_i^{(t)}] \geq 1 - 1/(\text{poly } n)$    (19)


Note that equation 19 would have been true even if v did not belong to Z ®. Furthermore, 
equation 19 holds regardless of the event £^ t ~ 1 \ 

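    Next, consider any node $u \in Z_i^{(t)}$ with $D_u(Z_i^{(t)}, E^{(t)}) < d^{(t)}$. A similar argument shows:

    $\Pr[u \notin Z_{i+1}^{(t)} \mid Z_i^{(t)}] \geq \Pr[D_u(Z_i^{(t)}, S_i^{(t)}) < (1 + \varepsilon)^2 c \log n \mid Z_i^{(t)}] \geq 1 - 1/(\text{poly } n)$    (20)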


Note that equation 20 would have been true even if u did not belong to Z®. Furthermore, 
equation 20 holds regardless of the event £ ( - t ^ 1 ' 1 . 

    Thus, applying a union bound on equations 19, 20 over all nodes in $Z_i^{(t)}$, we infer that:

    $\Pr[\mathcal{E}_i^{(t)} \mid Z_i^{(t)}] \geq 1 - 1/(\text{poly } n)$.

Since the guarantee $\Pr[\mathcal{E}_i^{(t)} \mid Z_i^{(t)}] \geq 1 - 1/(\text{poly } n)$ holds for every possible instantiation of $Z_i^{(t)}$, we get $\Pr[\mathcal{E}_i^{(t)}] \geq 1 - 1/(\text{poly } n)$. Taking a union bound over all indices $i \in \{1, \ldots, L-1\}$, we infer that $\Pr[\mathcal{E}^{(t)}] = \Pr[\bigcap_{i=1}^{L-1} \mathcal{E}_i^{(t)}] \geq 1 - 1/(\text{poly } n)$.

In other words, at every time-step $t \in [t_0, t_1]$ the event $\mathcal{E}^{(t)}$ occurs with high probability, and this holds regardless of the past events $\mathcal{E}^{(t')}$, $t' < t$. Hence, taking a union bound over all time-steps in the interval $[t_0, t_1]$, we get: With high probability, for all $t \in [t_0, t_1]$ the tuple $(Z_1^{(t)}, \ldots, Z_L^{(t)})$ maintained by the subroutine Dynamic-stream gives an $(\alpha, d^{(t)}, L)$-decomposition of the input graph $G^{(t)}$. This concludes the proof of the lemma. □


5.5.4 Data structures for implementing the procedure in Figure 4 

Recall the notations introduced in Section 2. 

• Consider any node $v \in V$ and any $i \in \{1, \ldots, L-1\}$. We maintain the doubly linked lists $\{\text{Friends}_i[v, j]\}$, $1 \leq j \leq L-1$, as defined below. Each of these lists is defined by the neighborhood of $v$ induced by the sampled edges in $S_i$. Recall Definition 2.5.

  - If $i \leq \ell(v)$, then we have:

    * $\text{Friends}_i[v, j]$ is empty for all $j > i$.

    * $\text{Friends}_i[v, j] = N_v(Z_j, S_i)$ for $j = i$.

    * $\text{Friends}_i[v, j] = N_v(V_j, S_i)$ for all $j < i$.

  - Else if $i > \ell(v)$, then we have:

    * $\text{Friends}_i[v, j]$ is empty for all $j > \ell(v)$.

    * $\text{Friends}_i[v, j] = N_v(Z_j, S_i)$ for $j = \ell(v)$.

    * $\text{Friends}_i[v, j] = N_v(V_j, S_i)$ for all $j < \ell(v)$.

For every node $v \in V$, we maintain a counter $\text{Degree}_i[v]$, which keeps track of the number of nodes in $\text{Friends}_i[v, i]$. Note that if $\ell(v) < i$, then this counter equals zero. Further, we maintain a doubly linked list Dirty-Nodes[$i$]. This consists of all the nodes $v \in V$ having either $\{\text{Degree}_i[v] > (1 - \varepsilon)^2 \alpha c \log n$ and $\ell(v) = i\}$ or $\{\text{Degree}_i[v] < (1 + \varepsilon)^2 c \log n$ and $\ell(v) > i\}$.

Remark. Note that for any given index $i \in \{1, \ldots, L-1\}$ and any time-step $t \leq T$, an edge $e \in E^{(t)}$ of the input graph appears at most once among the samples in $S_i^{(t)}$ (see Theorem 5.3). Thus, the number of occurrences of an edge among the samples $S_1^{(t)}, \ldots, S_{L-1}^{(t)}$ is at most $(L - 1)$.
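
To make the definition of Dirty-Nodes[$i$] concrete, the following sketch recomputes the counter $\text{Degree}_i[v]$ and the dirty set from scratch. It is only an illustration: the algorithm maintains these quantities incrementally with linked lists, whereas the hypothetical helpers `degree_counter` and `dirty_nodes` below rescan the sample. The dictionary `level` (mapping each node to $\ell(v)$) and the explicit thresholds `promote_thr` and `demote_thr` are assumptions of the example.

```python
def degree_counter(v, i, level, sample_i):
    """Degree_i[v]: neighbours of v inside Z_i via S_i, or 0 if v lies below level i."""
    if level[v] < i:
        return 0
    return sum(1 for (a, b) in sample_i
               if (a == v and level[b] >= i) or (b == v and level[a] >= i))


def dirty_nodes(i, level, sample_i, promote_thr, demote_thr):
    """Nodes that Figure 4 would move in iteration i: promotions from level i, demotions to level i."""
    dirty = set()
    for v in level:
        d = degree_counter(v, i, level, sample_i)
        should_promote = level[v] == i and d > promote_thr
        should_demote = level[v] > i and d < demote_thr
        if should_promote or should_demote:
            dirty.add(v)
    return dirty
```

Here $Z_i$ is represented implicitly as the set of nodes with level at least $i$.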

5.5.5 Implementing the procedure in Figure 4 during a dense time-interval $[t_0, t_1]$

Consider any dense time-interval $[t_0, t_1] \subseteq [1, T]$, and fix any time-step $t \in [t_0, t_1]$. The $t$-th edge insertion/deletion in the input graph might lead to some changes in the random subsets of edges $S_1, \ldots, S_{L-1} \subseteq E$. However, Theorem 5.3 implies that an edge insertion/deletion in $G$ can lead to at most two edge insertions/deletions in $S_i$, for all $i \in \{1, \ldots, L-1\}$. Thus, due to the $t$-th update in the stream, there can be at most $O(L) = \tilde{O}(1)$ insertions/deletions in the random sets $S_1, \ldots, S_{L-1}$ (see equation 6). After each such edge insertion/deletion in an $S_i$, $i \in \{1, \ldots, L-1\}$, we update the relevant data structures described in Section 5.5.4. Since an edge $(u, v) \in S_i$ can potentially appear only in the lists $\text{Friends}_i[x, j]$ with $x \in \{u, v\}$ and $j \in \{1, \ldots, L-1\}$ (see Section 5.5.4), we reach the following conclusion:


• When an edge insertion/deletion in $G$ leads to changes in the random edge sets $S_1, \ldots, S_{L-1} \subseteq E$, we can update the Friends and Dirty-Nodes lists from Section 5.5.4 in $\tilde{O}(1)$ time.

After updating the edge sets $S_1, \ldots, S_{L-1}$ and the Friends and Dirty-Nodes lists, we run the procedure described in Figure 4. Now, consider the $i$-th iteration of the main For loop (Steps 05-10) in Figure 4, for some index $i \in \{1, \ldots, L-1\}$. The purpose of this iteration is to construct the set $Z_{i+1}^{(t)}$, based on the sets $Z_i^{(t)}$ and $S_i^{(t)}$. Below, we state an alternate way of visualizing this iteration.

We scan through the list of nodes $u$ with $\ell(u) = i$ and $D_u(Z_i^{(t)}, S_i^{(t)}) > (1 - \varepsilon)^2 \alpha c \log n$. While considering each such node $u$, we increment its level from $i$ to $(i+1)$. This takes care of the Steps (05) and (07). Next, we scan through the list of nodes $v$ with $\ell(v) > i$ and $D_v(Z_i^{(t)}, S_i^{(t)}) < (1 + \varepsilon)^2 c \log n$. While considering any such node $v$ at level $\ell(v) = j_v > i$ (say), we decrement its level from $j_v$ to $i$. This takes care of the Steps (06), (08) and (09).

Note that the nodes undergoing a level-change in the preceding paragraph are precisely the ones that appear in the list Dirty-Nodes[$i$] just before the $i$-th iteration of the main For loop. Thus, we can implement Steps (05-10) as follows: Scan through the nodes $y$ in Dirty-Nodes[$i$] one after another. While considering any such node $y$, change its level as per Figure 4, and then update the relevant data structures to reflect this change.

The next lemma states the space complexity of this procedure. 

Lemma 5.9. Our algorithm in Figure 4 can be implemented in $\tilde{O}(n)$ space.

Proof. The amount of space needed is dominated by the number of edges in $\{S_i^{(t)}\}$, $i \in [L-1]$. Since $|S_i^{(t)}| \leq s$ for each $i \in [L-1]$, the space complexity is $(L - 1) \cdot s = \tilde{O}(n)$ (see equations 6, 17). □

The claim below bounds the time taken by a single iteration of the main For loop in Figure 4. 
This will be crucially used to analyze the overall update time of our algorithm in Section 5.5.6. 

Claim 5.10. Fix any time-step $t \in [t_0, t_1]$, and consider the $i$-th iteration of the main For loop in Figure 4 for some $i \in \{1, \ldots, L-1\}$. Consider two nodes $u, v \in Z_i^{(t)}$ such that:

• (a) the level of $u$ is increased from $i$ to $(i + 1)$ in Step (07), and

• (b) the level of $v$ is decreased to $i$ in Steps (08-09).

Updating the relevant data structures for this step requires $\sum_{i' > i} O(1 + D_y(Z_i^{(t)}, S_{i'}^{(t)}))$ time, where $y = u$ (resp. $v$) in the former (resp. latter) case.

Proof. Follows from the fact that we only need to update the lists $\text{Friends}_{i'}[x, j]$ where $i' > i$, $x \in \{y\} \cup N_y(Z_i^{(t)}, S_{i'}^{(t)})$ and $j \in \{i, i + 1\}$. □

5.5.6 The amortized update time of Dynamic-Stream during a dense time-interval 

In this section, we bound the amortized update time of the subroutine Dynamic-stream during a dense time-interval $[t_0, t_1]$. Recall that the subroutine Dynamic-stream maintains an $(\alpha, d^{(t)}, L)$-decomposition $(V_1^{(t)}, \ldots, V_L^{(t)})$ of the input graph $G^{(t)}$ throughout the duration of such an interval. To bound the amortized update time, we use a potential function $B$ as defined in equation 24. Note that the potential $B$ is uniquely determined by the assignment of the nodes $v \in V$ to the levels $\{1, \ldots, L\}$ and by the contents of the random sets $S_1, \ldots, S_{L-1}$. For all nodes $v \in V$, we define:

$\Gamma_i(v) = \max(0, (1 - \varepsilon)^2 \alpha c \log n - D_v(Z_i, S_i))$ for all $i \in \{1, \ldots, \ell(v) - 1\}$    (21)

$\Phi(v) = (L/\varepsilon) \cdot \sum_{i=1}^{\ell(v)-1} \Gamma_i(v)$    (22)

For all $u, v \in V$, let $f(u, v) = 1$ if $\ell(u) = \ell(v)$ and $0$ otherwise. Also, let $r_{uv} = \min(\ell(u), \ell(v))$. For all $i \in \{1, \ldots, L-1\}$, $(u, v) \in S_i$, we define:

$\Psi_i(u, v) = 0$ if $r_{uv} \geq i$; and $\Psi_i(u, v) = 2 \cdot (i - r_{uv}) + f(u, v)$ otherwise.    (23)

The potential $B$ is defined as the sum of the potentials associated with all the nodes and edges.

$B = \sum_{v \in V} \Phi(v) + \sum_{i=1}^{L-1} \sum_{e \in S_i} \Psi_i(e)$    (24)


It might be instructive to contrast this potential function with the one used to analyze the dynamic 
algorithm in Section 4. 
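
As a small worked illustration of equations 21-24, the sketch below evaluates the potential $B$ directly from a level assignment and the sampled edge sets. The function name `potential`, the dictionary `level`, and the single threshold argument `promote_thr` (standing for $(1-\varepsilon)^2 \alpha c \log n$) are assumptions made for the example; the set $Z_i$ is taken to be the nodes at level $i$ or higher.

```python
def potential(level, samples, L, eps, promote_thr):
    """Evaluate B = sum_v Phi(v) + sum_i sum_{e in S_i} Psi_i(e).

    level[v] is the level of node v (1-based); samples[i - 1] is the edge set S_i.
    """
    def deg(v, i):
        # D_v(Z_i, S_i), with Z_i = nodes whose level is at least i.
        return sum(1 for (a, b) in samples[i - 1]
                   if (a == v and level[b] >= i) or (b == v and level[a] >= i))

    node_part = 0.0
    for v in level:
        gammas = (max(0.0, promote_thr - deg(v, i)) for i in range(1, level[v]))
        node_part += (L / eps) * sum(gammas)              # equations 21-22

    edge_part = 0
    for i in range(1, L):
        for (u, v) in samples[i - 1]:
            r = min(level[u], level[v])
            if r < i:                                     # equation 23, second case
                edge_part += 2 * (i - r) + (1 if level[u] == level[v] else 0)
    return node_part + edge_part                          # equation 24
```

For instance, with three nodes at levels 1, 2, 3, samples $S_1 = \{(a,b), (b,c)\}$ and $S_2 = \{(a,c)\}$, $L = 3$, $\varepsilon = 0.5$ and threshold 2, only node $c$ and the edge $(a, c) \in S_2$ contribute, giving $B = 18 + 2 = 20$.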

Roadmap. Our analysis works in three steps. 


1. In Definition 5.11, we describe an event $\mathcal{F}$. To understand the intuition behind this definition, recall the discussion on the third technical challenge in Section 5.1.1: We have to overcome the apparent obstacle that different levels of the $(\alpha, d^{(t)}, L)$-decomposition are constructed using different subsets of randomly sampled edges. Intuitively, the event $\mathcal{F}$ guarantees that the degrees of a node among these different subsets of edges are approximately the same. This helps in extending the ideas from the potential function based analysis in Section 4 to the current setting.

2. In Lemma 5.12, we show that the event $\mathcal{F}$ holds with high probability. The proof of Lemma 5.12 appears in Section 5.5.7.

3. Conditioned on the event $\mathcal{F}$, we show that our algorithm has $\tilde{O}(1)$ amortized update time (see Lemma 5.13 and Corollary 5.14). The proof of Lemma 5.13 appears in Section 5.5.8.

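Definition 5.11. Recall the procedure in Figure 4. For all levels $i, i' \in \{1, \ldots, L-1\}$ with $i < i'$, and time-steps $t \in [t_0, t_1]$, we define an event $\mathcal{F}_{i,i'}^{(t)}$ as follows.

• The event $\mathcal{F}_{i,i'}^{(t)}$ occurs iff the following conditions are satisfied.

  - $\{D_v(Z_i^{(t)}, S_{i'}^{(t)}) \geq \frac{(1-\varepsilon)^4}{(1+\varepsilon)^2} \cdot (\alpha c \log n)$ for all $v \in A_i^{(t)}\}$, and

  - $\{D_v(Z_i^{(t)}, S_{i'}^{(t)}) \leq \frac{(1+\varepsilon)^4}{(1-\varepsilon)^2} \cdot c \log n$ for all $v \in B_i^{(t)}\}$.

Now, define the event $\mathcal{F}^{(t)} = \bigcap_{i,i'} \mathcal{F}_{i,i'}^{(t)}$, and the event $\mathcal{F} = \bigcap_{t=t_0}^{t_1} \mathcal{F}^{(t)}$.

Lemma 5.12. The event $\mathcal{F}$ holds with high probability.

Lemma 5.13. Conditioned on the event $\mathcal{F}$, we have: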


• (a) $0 \leq B = \tilde{O}(n)$ at each time-step $t \in [t_0, t_1]$.

• (b) Insertion/deletion of an edge in $G$ (ignoring the call to the procedure in Figure 4) changes the potential $B$ by $\tilde{O}(1)$.

• (c) For every constant amount of computation performed while implementing the procedure in Figure 4, the potential $B$ drops by $\Omega(1)$.

Corollary 5.14. With high probability, subroutine Dynamic-stream spends $\tilde{O}(n + (t_1 - t_0))$ time during the dense time-interval $[t_0, t_1]$. So its amortized update time is $\tilde{O}(1)$ with high probability.

Proof. Condition on the event $\mathcal{F}$ (which occurs with high probability by Lemma 5.12). At each time-step $t \in [1, T]$, we maintain the random sets of edges $\{S_i^{(t)}\}$ as per Theorem 5.3. This takes $\tilde{O}(1)$ worst case update time. Further, a single edge insertion/deletion in the input graph leads to at most two edge insertions/deletions in each of these sets $\{S_i^{(t)}\}$, $i \in \{1, \ldots, L-1\}$.

Now, using the random sets $\{S_i^{(t)}\}$, at each time-step $t \in [t_0, t_1]$ we maintain an $(\alpha, d^{(t)}, L)$-decomposition of the input graph (see Lemma 5.8). Lemma 5.13 implies that with high probability, this requires a total update time of $\tilde{O}(n + (t_1 - t_0))$ for the entire duration of the interval $[t_0, t_1]$.

Finally, recall that either the dense time-interval spans $\Omega(n)$ time-steps, or it ends at time-step $T$ (see the discussion on the subroutine (P3) in Section 5.3). Hence, applying an argument similar to the one used in Section 5.4 (see the discussions preceding equation 11), we conclude that with high probability the subroutine Dynamic-stream spends $\tilde{O}(T)$ total time during the first $T$ updates in the dynamic stream. In other words, with high probability the subroutine Dynamic-stream has an amortized update time of $\tilde{O}(1)$. □


5.5.7 Proof of Lemma 5.12 

Fix any $1 \leq i < i' \leq L - 1$, and any time-step $t \in [t_0, t_1]$.

• Condition on an instantiation of the random set $Z_i^{(t)}$. Note that this also determines the sets
and (see Figure 4). 

  - Let $W_i^{(t)} \subseteq Z_i^{(t)}$ be the subset of nodes $v$ with small degrees $D_v(Z_i^{(t)}, E^{(t)})$. Specifically,

    $W_i^{(t)} = \{v \in Z_i^{(t)} : D_v(Z_i^{(t)}, E^{(t)}) \leq \frac{(1-\varepsilon)^2}{(1+\varepsilon)^2} \cdot \alpha d^{(t)}\}$    (25)

    The node set $W_i^{(t)}$ is uniquely determined by the node set $Z_i^{(t)}$ (which we are conditioning upon) and the edge set $E^{(t)}$ (which is given by the stream of updates in the input graph).

    Now, consider any node $v \in W_i^{(t)}$. By Lemma 5.7, the contents of the random set $S_i^{(t)}$ are independent of $Z_i^{(t)}$. By Theorem 5.3, each edge $(u, v) \in E^{(t)}$ is included in $S_i^{(t)}$ with probability $(1 \pm \varepsilon) \cdot (s / m^{(t)})$. Applying linearity of expectation and equation 16, we get:

    $E[D_v(Z_i^{(t)}, S_i^{(t)})] \leq (1 + \varepsilon) \cdot (c \log n / d^{(t)}) \cdot D_v(Z_i^{(t)}, E^{(t)})$    (26)

    $\leq \frac{(1-\varepsilon)^2}{(1+\varepsilon)} \cdot (\alpha c \log n)$    (27)

    Equation 27 follows from equations 25 and 26. Next, for each edge $(u, v) \in E^{(t)}$ incident upon the node $v$, consider the random event that $(u, v) \in S_i^{(t)}$. By Theorem 5.3, these random events are negatively associated (see Section 2.4). Thus, applying a Chernoff bound on equation 27 and recalling the definition of $A_i^{(t)}$ from Figure 4, we get:

    $\Pr[v \in A_i^{(t)}] = \Pr[D_v(Z_i^{(t)}, S_i^{(t)}) > (1 - \varepsilon)^2 \cdot (\alpha c \log n)] \leq 1/(\text{poly } n)$    (28)

    Applying a union bound on equation 28 over all the nodes in $W_i^{(t)}$, we get:

    $\Pr[W_i^{(t)} \cap A_i^{(t)} \neq \emptyset] \leq 1/(\text{poly } n)$    (29)

    In other words, with high probability no node in $W_i^{(t)}$ belongs to the set $A_i^{(t)}$.

    We will now bound the degrees of the nodes in $Z_i^{(t)} \setminus W_i^{(t)}$ with respect to the random edge set $S_{i'}^{(t)}$. Towards this end, consider any node $x \in Z_i^{(t)} \setminus W_i^{(t)}$. By Lemma 5.7, the contents of the random set $S_{i'}^{(t)}$ are independent of $Z_i^{(t)}$. By Theorem 5.3, each edge $(u, v) \in E^{(t)}$ is included in $S_{i'}^{(t)}$ with probability $(1 \pm \varepsilon) \cdot (s / m^{(t)})$. Applying linearity of expectation and equations 16, 25 we get:

    $E[D_x(Z_i^{(t)}, S_{i'}^{(t)})] \geq (1 - \varepsilon) \cdot (c \log n / d^{(t)}) \cdot D_x(Z_i^{(t)}, E^{(t)})$    (30)

    $\geq \frac{(1-\varepsilon)^3}{(1+\varepsilon)^2} \cdot (\alpha c \log n)$    (31)

    Equation 31 follows from equations 25 and 30. Next, for each edge $(u, x) \in E^{(t)}$ incident upon the node $x$, consider the random event that $(u, x) \in S_{i'}^{(t)}$. By Theorem 5.3, these random events are negatively associated (see Section 2.4). Thus, applying a Chernoff bound on equation 31, we get:

    $\Pr[D_x(Z_i^{(t)}, S_{i'}^{(t)}) < \frac{(1-\varepsilon)^4}{(1+\varepsilon)^2} \cdot (\alpha c \log n)] \leq 1/(\text{poly } n)$    (32)

    Now, taking a union bound on equation 32 over all the nodes in $Z_i^{(t)} \setminus W_i^{(t)}$, we get:

    $\Pr[D_x(Z_i^{(t)}, S_{i'}^{(t)}) < \frac{(1-\varepsilon)^4}{(1+\varepsilon)^2} \cdot (\alpha c \log n)$ for some $x \in Z_i^{(t)} \setminus W_i^{(t)}] \leq 1/(\text{poly } n)$    (33)

    In other words, with high probability every node $x \in Z_i^{(t)} \setminus W_i^{(t)}$ has a high degree $D_x(Z_i^{(t)}, S_{i'}^{(t)})$. Now, taking a union bound over equations 29 and 33, we conclude that:

    With high probability, $D_x(Z_i^{(t)}, S_{i'}^{(t)}) \geq \frac{(1-\varepsilon)^4}{(1+\varepsilon)^2} \cdot (\alpha c \log n)$ for every node $x \in A_i^{(t)}$.    (34)

    Using a similar argument for the node set $B_i^{(t)}$, we can infer that:

    With high probability, $D_x(Z_i^{(t)}, S_{i'}^{(t)}) \leq \frac{(1+\varepsilon)^4}{(1-\varepsilon)^2} \cdot (c \log n)$ for every node $x \in B_i^{(t)}$.    (35)

    Taking a union bound over equations 34 and 35, we infer that:

    Given any instantiation of $Z_i^{(t)}$, the event $\mathcal{F}_{i,i'}^{(t)}$ occurs with high probability.    (36)

From equation 36, we infer that:

    The event $\mathcal{F}_{i,i'}^{(t)}$ occurs with high probability.    (37)

The lemma follows by applying a union bound on equation 37 over all indices $i, i' \in \{1, \ldots, L-1\}$ with $i < i'$ and time-steps $t \in [t_0, t_1]$.
5.5.8 Proof of Lemma 5.13

Part (a) and part (b) of the lemma hold independently of the event $\mathcal{F}$. It is only during the proof of part (c) that we have to condition on the event $\mathcal{F}$.

Proof of part (a). Fix any time-step $t \in [t_0, t_1]$. The proof follows from three facts.

1. We have $0 \leq \Phi(v) \leq (L/\varepsilon) \cdot L \cdot (1 - \varepsilon)^2 \alpha c \log n = \tilde{O}(1)$ for all $v \in V$.

2. We have $0 \leq \Psi_i(u, v) \leq 3L = \tilde{O}(1)$ for all $i \in [L-1]$, $(u, v) \in S_i^{(t)}$.

3. We have $|S_i^{(t)}| \leq s = \tilde{O}(n)$ for all $i \in [L-1]$. This follows from equation 17.

Proof of part (b). By Theorem 5.3, insertion/deletion of an edge in $G$ leads to at most two insertions/deletions in each of the random sets $S_1^{(t)}, \ldots, S_{L-1}^{(t)} \subseteq E^{(t)}$. Since $L = \tilde{O}(1)$ (see equation 6), it suffices to show that for a single edge insertion/deletion in any given $S_i^{(t)}$, the potential $B$ changes by at most $\tilde{O}(1)$ (ignoring the call to the procedure in Figure 4).

Towards this end, fix any $i \in [L-1]$, and suppose that a single edge $(u, v)$ is inserted into (resp. deleted from) $S_i^{(t)}$. For each endpoint $x \in \{u, v\}$, this changes the potential $\Phi(x)$ by at most $O(L/\varepsilon)$. The potentials $\Phi(y)$ for all other nodes $y \in V \setminus \{u, v\}$ remain unchanged. Additionally, the potential $\Psi_i(u, v) \in [0, 3L]$ is created (resp. destroyed). Thus, we infer that the absolute value of the change in the overall potential $B$ is at most $O(3L + 2L/\varepsilon) = \tilde{O}(1)$.

Proof of part (c). Fix any time-step $t \in [t_0, t_1]$, and any iteration of the For loop in Figure 4 while processing the update in time-step $t$. Consider two possible events.

Case 1: A node $v \in Z_i^{(t)}$ is promoted from level $i$ to level $(i + 1)$ in Step 07 of Figure 4.

This happens only if $v \in A_i^{(t)}$. Let $C$ be the amount of computation performed during this step. By Claim 5.10, we have:

$C = \sum_{i'=i+1}^{L-1} O(1 + D_v(Z_i^{(t)}, S_{i'}^{(t)}))$    (38)

Let $\Delta$ be the net decrease in the overall potential $B$ due to this step. We observe that:

1. Consider any $i' > i$. For each edge $(u, v) \in S_{i'}^{(t)}$ with $u \in Z_i^{(t)}$, the potential $\Psi_{i'}(u, v)$ decreases by at least one. For every other edge $e \in S_{i'}^{(t)}$, the potential $\Psi_{i'}(e)$ remains unchanged.

2. For each $i' \in [i]$ and each edge $e \in S_{i'}^{(t)}$, the potential $\Psi_{i'}(e)$ remains unchanged.

3. Since the node $v$ is being promoted to level $(i + 1)$, we have $D_v(Z_i^{(t)}, S_i^{(t)}) > (1 - \varepsilon)^2 \alpha c \log n$. Thus, the potential $\Phi(v)$ remains unchanged. For each node $u \neq v$, the potential $\Phi(u)$ can only decrease (this holds since the degree $D_u(Z_j, S_j)$, for any level $j$, can only increase as node $v$ increases its level from $i$ to $i + 1$).

Taking into account all these observations, we infer the following inequality.

$\Delta \geq \sum_{i'=i+1}^{L-1} D_v(Z_i^{(t)}, S_{i'}^{(t)})$    (39)

Since $v \in A_i^{(t)}$, and since we have conditioned on the event $\mathcal{F}$ (see Definition 5.11), we get:

$D_v(Z_i^{(t)}, S_{i'}^{(t)}) > 0$ for all $i' \in [i + 1, L - 1]$.    (40)

Eq. (38), (39), (40) imply that the decrease in $B$ is sufficient to pay for the computation performed.

Case 2: A node $v \in Z_i^{(t)}$ is demoted from level $j$ to level $i$ in Steps (08-09) of Figure 4.

This can happen only if $j > i$ and $v \in B_i^{(t)}$. Let $C$ denote the amount of computation performed during this step. By Claim 5.10, we have

$C = \sum_{i'=i+1}^{L-1} O(1 + D_v(Z_i^{(t)}, S_{i'}^{(t)}))$    (41)

Let $\gamma = (1 + \varepsilon)^4 / (1 - \varepsilon)^2$. Equation (42) holds since $v \in B_i^{(t)}$ and since we conditioned on the event $\mathcal{F}$ (see Definition 5.11). Equation (43) follows from equations (41), (42) and since $\gamma, c$ are constants.

$D_v(Z_i^{(t)}, S_{i'}^{(t)}) \leq \gamma c \log n$ for all $i' \in [i, L - 1]$    (42)

$C = O(L \log n)$    (43)

Let $\Delta$ be the net decrease in the overall potential $B$ due to this step. We observe that:

1. By eq. (42), the potential $\Phi(v)$ decreases by at least $(j - i) \cdot (L/\varepsilon) \cdot ((1 - \varepsilon)^2 \alpha - \gamma) \cdot (c \log n)$.

2. For $u \in V \setminus \{v\}$ and $i' \in [1, i] \cup [j + 1, L - 1]$, the potential $\Gamma_{i'}(u)$ remains unchanged. This observation, along with equation (42), implies that the sum $\sum_{u \neq v} \Phi(u)$ increases by at most $(L/\varepsilon) \cdot \sum_{i'=i+1}^{j} D_v(Z_i^{(t)}, S_{i'}^{(t)}) \leq (j - i) \cdot (L/\varepsilon) \cdot (\gamma c \log n)$.

3. For every $i' \in [1, i]$ and $e \in S_{i'}^{(t)}$, the potential $\Psi_{i'}(e)$ remains unchanged. Next, consider any $i' \in [i + 1, L - 1]$. For each edge $(u, v) \in S_{i'}^{(t)}$ with $u \in Z_i^{(t)}$, the potential $\Psi_{i'}(u, v)$ increases by at most $3(j - i)$. For every other edge $e \in S_{i'}^{(t)}$, the potential $\Psi_{i'}(e)$ remains unchanged. These observations, along with equation (42), imply that the sum $\sum_{i'} \sum_{e \in S_{i'}} \Psi_{i'}(e)$ increases by at most $\sum_{i'=i+1}^{L-1} 3(j - i) \cdot D_v(Z_i^{(t)}, S_{i'}^{(t)}) \leq (j - i) \cdot (3L) \cdot (\gamma c \log n)$.

Taking into account all these observations, we get:

$\Delta \geq (j - i)(L/\varepsilon)((1 - \varepsilon)^2 \alpha - \gamma)(c \log n) - (j - i)(L/\varepsilon)(\gamma c \log n) - (j - i)(3L)(\gamma c \log n)$

$= (j - i) \cdot (L/\varepsilon) \cdot ((1 - \varepsilon)^2 \alpha - 2\gamma - 3\varepsilon\gamma) \cdot (c \log n)$

$\geq L c \log n$    (44)

The last inequality holds since $(j - i) \geq 1$ and $\alpha \geq (\varepsilon + (2 + 3\varepsilon)\gamma)/(1 - \varepsilon)^2 = 2 + \Theta(\varepsilon)$, for some sufficiently small constant $\varepsilon \in (0, 1)$. From eq. (43) and (44), we conclude that the net decrease in the overall potential $B$ is sufficient to pay for the cost of the computation performed.
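
The condition on $\alpha$ used in the last step of equation 44 can be checked numerically. The sketch below is only a sanity check under the stated formulas; the function names `gamma`, `min_alpha` and `drop_coefficient` are hypothetical. It computes the smallest admissible $\alpha$ for a given $\varepsilon$ and the factor multiplying $(j - i) \cdot L c \log n$ in the lower bound on $\Delta$.

```python
def gamma(eps):
    """gamma = (1 + eps)^4 / (1 - eps)^2, as defined before equation 42."""
    return (1 + eps) ** 4 / (1 - eps) ** 2


def min_alpha(eps):
    """Smallest alpha satisfying alpha >= (eps + (2 + 3*eps) * gamma) / (1 - eps)^2."""
    return (eps + (2 + 3 * eps) * gamma(eps)) / (1 - eps) ** 2


def drop_coefficient(alpha, eps):
    """Factor multiplying (j - i) * L * c * log n in the second line of equation 44."""
    g = gamma(eps)
    return ((1 - eps) ** 2 * alpha - 2 * g - 3 * eps * g) / eps
```

At `alpha = min_alpha(eps)` the coefficient equals 1, which is exactly the bound $\Delta \geq L c \log n$; any larger $\alpha$ gives a larger drop, and `min_alpha(eps)` tends to 2 as `eps` tends to 0.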
6 Extension to Directed Graphs

In this section, we extend our dynamic algorithm from Section 4 to directed graphs. The notion of “density” of a directed graph was introduced by Kannan et al. [25]. We first recall their definition.

Definition 6.1. [25] Consider two subsets of nodes $X, Y \subseteq V$ in a directed graph $G = (V, E)$. The “density” of this pair equals $\rho(X, Y) = |E(X, Y)| / \sqrt{|X| |Y|}$. Here, $E(X, Y) = \{(u, v) \in E : u \in X, v \in Y\}$ is the set of edges going from $X$ to $Y$. The value of the densest subgraph is given by $\rho(G) = \max_{X, Y \subseteq V} \rho(X, Y)$. Note that we do not require the sets $X$ and $Y$ to be mutually disjoint.
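
Definition 6.1 can be made concrete with a brute-force computation on a tiny graph. The sketch below is exponential in the number of nodes and is meant only to illustrate the definition, not the dynamic algorithm of this section; the function names `density`, `nonempty_subsets` and `densest_pair` are assumptions of the example.

```python
from itertools import combinations
from math import sqrt


def density(edges, X, Y):
    """rho(X, Y) = |E(X, Y)| / sqrt(|X| * |Y|) for nonempty X and Y."""
    count = sum(1 for (u, v) in edges if u in X and v in Y)
    return count / sqrt(len(X) * len(Y))


def nonempty_subsets(nodes):
    """Yield every nonempty subset of the given nodes."""
    nodes = list(nodes)
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            yield set(combo)


def densest_pair(nodes, edges):
    """Return (rho(G), X, Y) by exhaustive search over all pairs of subsets."""
    best = (0.0, set(), set())
    for X in nonempty_subsets(nodes):
        for Y in nonempty_subsets(nodes):
            value = density(edges, X, Y)
            if value > best[0]:
                best = (value, X, Y)
    return best
```

For the graph with edges $(0,1)$ and $(0,2)$, the search returns $X = \{0\}$, $Y = \{1, 2\}$ with density $2/\sqrt{2} = \sqrt{2}$.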

Throughout this section, we denote the input graph by $G = (V, E)$. In the beginning, the input graph is empty, i.e., we have $E = \emptyset$. Subsequently, at each time-step, either a directed edge is inserted into the graph, or an already existing edge is deleted from the graph. The set of nodes, on the other hand, remains unchanged. Our goal is to maintain a good approximation of $\rho(G)$ in this dynamic setting. The main result is stated below.

Theorem 6.2. We can deterministically maintain an $(8 + \varepsilon)$-approximation to the value of the densest subgraph of a directed graph $G = (V, E)$. The algorithm requires $O(m + n)$ space, where $m$ (resp. $n$) denotes the number of edges (resp. nodes) in the graph. Furthermore, the algorithm has an amortized update time of $\tilde{O}(1)$ and a query time of $\tilde{O}(1)$.

We devote the rest of Section 6 to the proof of Theorem 6.2. We first define the preliminary 
concepts and notations in Section 6.1. Next, we extend the notion of an (a, d, L)-decomposition 
to directed graphs in Section 6.2. In Section 6.3, we present our main algorithm. Finally, in 
Section 6.4, we combine all these ingredients together and conclude the proof of Theorem 6.2. 

6.1 Notations and Preliminaries 

We first define the notion of a “derived graph”, which will be crucially used in our algorithm.

Definition 6.3. The “derived graph” $G' = (V', E')$ of the input graph $G = (V, E)$ is constructed as follows. For each node $v \in V$, we create two nodes $s_v$ and $t_v$. We define the node set $V' = S' \cup T'$, where $S' = \{s_v : v \in V\}$ and $T' = \{t_v : v \in V\}$. Next, for each directed edge $(u, v) \in E$, we create the directed edge $(s_u, t_v)$, and define the directed edge set $E' = \{(s_u, t_v) : (u, v) \in E\}$. Thus, in the derived graph $G' = (V', E')$, each node in $S'$ (resp. $T'$) has zero in-degree (resp. out-degree).

It is easy to check that the derived graph $G' = (V', E')$ can be maintained in the dynamic setting using $O(m + n)$ space and $O(1)$ update time: Fix the set of nodes $V' = S' \cup T'$, and whenever an edge $(u, v)$ is inserted into (resp. deleted from) $G = (V, E)$, insert (resp. delete) the corresponding derived edge $(s_u, t_v)$ in $G'$. From now on, unless explicitly mentioned otherwise, our main algorithm will work on the derived graph $G' = (V', E')$. Before proceeding any further, we introduce some notations that will be used throughout the rest of this section.
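
The maintenance of the derived graph described above can be sketched as follows. This is one possible representation, assumed for illustration: the class name `DerivedGraph` and the encoding of $s_v$ and $t_v$ as the tuples `("s", v)` and `("t", v)` are not from the paper. With hash-based sets, each update takes constant expected time.

```python
class DerivedGraph:
    """Derived graph G' = (S' u T', E') of a directed graph on a fixed node set."""

    def __init__(self, nodes):
        # One copy s_v in S' and one copy t_v in T' for every node v.
        self.S = {("s", v) for v in nodes}
        self.T = {("t", v) for v in nodes}
        self.out_nbrs = {s: set() for s in self.S}  # N_s(T')
        self.in_nbrs = {t: set() for t in self.T}   # N_t(S')

    def insert(self, u, v):
        """Insert the derived edge (s_u, t_v) for the input edge (u, v)."""
        self.out_nbrs[("s", u)].add(("t", v))
        self.in_nbrs[("t", v)].add(("s", u))

    def delete(self, u, v):
        """Delete the derived edge (s_u, t_v) for the input edge (u, v)."""
        self.out_nbrs[("s", u)].discard(("t", v))
        self.in_nbrs[("t", v)].discard(("s", u))

    def edges(self):
        """Return E' as a set of (s_u, t_v) pairs."""
        return {(s, t) for s, ts in self.out_nbrs.items() for t in ts}
```

Note that a self-loop $(v, v)$ in $G$ becomes the ordinary edge $(s_v, t_v)$ in $G'$, so every node of $S'$ has zero in-degree and every node of $T'$ has zero out-degree by construction.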

• Consider the derived graph $G' = (V', E')$. Given any node $s \in S'$ and any subset of nodes $T \subseteq T'$, we let $N_s(T) = \{t \in T : (s, t) \in E'\}$ denote the set of neighbors of $s$ among the nodes in $T$. Furthermore, we let $D_s(T) = |N_s(T)|$ denote the degree of $s$ among the nodes in $T$. For a node $t \in T'$ and a subset $S \subseteq S'$, the notations $N_t(S) = \{s \in S : (s, t) \in E'\}$ and $D_t(S) = |N_t(S)|$ are defined analogously. Next, for any two subsets of nodes $S \subseteq S'$ and $T \subseteq T'$, we let $E'(S, T) = \{(s, t) \in E' : s \in S, t \in T\}$ denote the set of edges in the derived graph that are going from $S$ to $T$. We also define $\rho'(S, T) = |E'(S, T)| / \sqrt{|S| |T|}$ for all $S \subseteq S'$, $T \subseteq T'$. Hence, from Definition 6.1, it follows that $\rho'(S, T) = \rho(S, T)$. Accordingly, the value of the densest subgraph of the input graph $G = (V, E)$ is given by $\rho(G) = \max_{S \subseteq S', T \subseteq T'} \rho'(S, T)$. We will denote the densest subgraph of $G = (V, E)$ by the pair $(S^*, T^*)$, where $S^* \subseteq S'$ and $T^* \subseteq T'$. Thus, we have $\rho'(S^*, T^*) = \rho(G)$. Finally, we define the parameters $\lambda_{S'}, \lambda_{T'}$ as follows.

$\lambda_{S'} = |E'(S^*, T^*)| \cdot (1 - \sqrt{1 - 1/|S^*|})$ and $\lambda_{T'} = |E'(S^*, T^*)| \cdot (1 - \sqrt{1 - 1/|T^*|})$.    (45)


We now state one crucial lemma that will be used in the analysis of our algorithm. This lemma was proved by Khuller et al. [29]. For the sake of completeness, we state their proof below.

Lemma 6.4. [29] Consider the densest subgraph $(S^*, T^*)$ of the input graph $G = (V, E)$. We have:

1. $D_s(T^*) \geq \lambda_{S'}$ for all nodes $s \in S^*$, and $D_t(S^*) \geq \lambda_{T'}$ for all nodes $t \in T^*$.

2. $\rho(G)/2 \leq \sqrt{\lambda_{S'} \lambda_{T'}} \leq \rho(G)$.

Proof.

1. For the sake of contradiction, suppose that there exists a node $s \in S^*$ with $D_s(T^*) < \lambda_{S'}$. Then we can show that the density of the pair $(S^* \setminus \{s\}, T^*)$ is strictly larger than the density of the pair $(S^*, T^*)$. Specifically, we can infer the following guarantee:

$\rho'(S^* \setminus \{s\}, T^*) > \frac{|E'(S^*, T^*)| - \lambda_{S'}}{\sqrt{(|S^*| - 1) \cdot |T^*|}} = \frac{|E'(S^*, T^*)|}{\sqrt{|S^*| \cdot |T^*|}} = \rho'(S^*, T^*) = \rho(G)$.

Since $\rho(G)$ denotes the value of the densest subgraph of $G$, the above inequality leads to a contradiction. Thus, we conclude that $D_s(T^*) \geq \lambda_{S'}$ for all nodes $s \in S^*$. Using the same line of reasoning, we can conclude that $D_t(S^*) \geq \lambda_{T'}$ for all nodes $t \in T^*$.

2. From part 1 of the lemma, we have the following guarantee: Every node in $S^*$ (resp. $T^*$) has a degree of at least $\lambda_{S'}$ (resp. $\lambda_{T'}$) among the nodes in $T^*$ (resp. $S^*$). This implies that $|E'(S^*, T^*)| \geq |S^*| \cdot \lambda_{S'}$ and $|E'(S^*, T^*)| \geq |T^*| \cdot \lambda_{T'}$. Thus, we get:

$|E'(S^*, T^*)| \geq \sqrt{|S^*| \cdot |T^*|} \cdot \sqrt{\lambda_{S'} \cdot \lambda_{T'}}$    (46)

Since $\rho(G) = \rho'(S^*, T^*) = |E'(S^*, T^*)| / \sqrt{|S^*| \cdot |T^*|}$, from equation 46 we conclude that:

$\rho(G) = \frac{|E'(S^*, T^*)|}{\sqrt{|S^*| \cdot |T^*|}} \geq \sqrt{\lambda_{S'} \cdot \lambda_{T'}}$    (47)

Next, putting $|S^*| = 1/\sin^2 \theta_1$ and $|T^*| = 1/\sin^2 \theta_2$, and recalling equation 45, we get:

$\lambda_{S'} \cdot \lambda_{T'} = \frac{|E'(S^*, T^*)|^2}{|S^*| \cdot |T^*|} \cdot (|S^*| \cdot |T^*|) \cdot (1 - \sqrt{1 - 1/|S^*|}) \cdot (1 - \sqrt{1 - 1/|T^*|})$

$= (\rho'(S^*, T^*))^2 \cdot \frac{(1 - \cos \theta_1) \cdot (1 - \cos \theta_2)}{\sin^2 \theta_1 \sin^2 \theta_2}$

$= \frac{(\rho'(S^*, T^*))^2}{4 \cos^2(\theta_1/2) \cos^2(\theta_2/2)}$

$\geq \frac{(\rho'(S^*, T^*))^2}{4} = \frac{(\rho(G))^2}{4}$    (48)

From equation 48 we conclude that:

$\sqrt{\lambda_{S'} \cdot \lambda_{T'}} \geq \rho(G)/2$    (49)

Part 2 of the lemma follows from equations 47 and 49.

□
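
The chain of inequalities in equations 47-49 depends only on the edge count and the two set sizes, so it can be sanity-checked numerically. The sketch below assumes hypothetical helper names `lambdas` and `check_part_two`, and uses a small tolerance for floating-point rounding; it is a numerical check, not a substitute for the proof.

```python
from math import sqrt


def lambdas(num_edges, size_S, size_T):
    """Return (lambda_S', lambda_T') from equation 45 for |E'(S*, T*)| = num_edges."""
    lam_S = num_edges * (1 - sqrt(1 - 1 / size_S))
    lam_T = num_edges * (1 - sqrt(1 - 1 / size_T))
    return lam_S, lam_T


def check_part_two(num_edges, size_S, size_T, tol=1e-12):
    """Check rho/2 <= sqrt(lambda_S' * lambda_T') <= rho for the given sizes."""
    rho = num_edges / sqrt(size_S * size_T)
    lam_S, lam_T = lambdas(num_edges, size_S, size_T)
    geometric_mean = sqrt(lam_S * lam_T)
    return rho / 2 - tol <= geometric_mean <= rho + tol
```

The upper bound is tight when $|S^*| = |T^*| = 1$, and the ratio $\sqrt{\lambda_{S'} \lambda_{T'}} / \rho(G)$ approaches $1/2$ as both sets grow.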
6.2 $(\alpha, d_{S'}, d_{T'}, L)$-Decomposition and Its Properties

We extend the concept of an $(\alpha, d, L)$-decomposition (see Definition 2.5) to directed graphs. This requires us to introduce one additional parameter in the definition, and we call the corresponding entity an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition. Specifically, an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition of the derived graph $G' = (V', E')$, where $V' = S' \cup T'$, is given by two laminar families of subsets of nodes $S' = S_1 \supseteq S_2 \supseteq \cdots \supseteq S_L$ and $T' = T_1 \supseteq T_2 \supseteq \cdots \supseteq T_L$. These subsets are iteratively constructed as follows. First, we set $S_1 = S'$ and $T_1 = T'$. Next, for each $i \in \{1, \ldots, L-1\}$, while constructing the subsets $S_{i+1}, T_{i+1}$ from the subsets $S_i, T_i$, we ensure the following conditions:

• Every node $s \in S_i$ with $D_s(T_i) > \alpha \cdot d_{S'}$ is included in the subset $S_{i+1}$. On the other hand, every node $s \in S_i$ with $D_s(T_i) < d_{S'}$ is excluded from the subset $S_{i+1}$.

• Every node $t \in T_i$ with $D_t(S_i) > \alpha \cdot d_{T'}$ is included in the subset $T_{i+1}$. On the other hand, every node $t \in T_i$ with $D_t(S_i) < d_{T'}$ is excluded from the subset $T_{i+1}$.

We are now ready to formally define the concept of an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition of $G'$.

Definition 6.5. Fix any $\alpha \geq 1$, $d_{S'}, d_{T'} \geq 0$, and any positive integer $L$. Consider a family of subsets $S' = S_1 \supseteq \cdots \supseteq S_L$ and $T' = T_1 \supseteq \cdots \supseteq T_L$. The tuples $(S_1, \ldots, S_L)$ and $(T_1, \ldots, T_L)$ form an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition of the derived graph $G'$ iff for every $i \in \{1, \ldots, L-1\}$, we have:

1. $S_{i+1} \supseteq \{s \in S_i : D_s(T_i) > \alpha \cdot d_{S'}\}$ and $S_{i+1} \cap \{s \in S_i : D_s(T_i) < d_{S'}\} = \emptyset$.

2. $T_{i+1} \supseteq \{t \in T_i : D_t(S_i) > \alpha \cdot d_{T'}\}$ and $T_{i+1} \cap \{t \in T_i : D_t(S_i) < d_{T'}\} = \emptyset$.

Let $V'_i = (S_i \cup T_i) \setminus (S_{i+1} \cup T_{i+1})$ for all $i \in \{1, \ldots, L-1\}$, and $V'_i = S_i \cup T_i$ for $i = L$. We say that the nodes in $V'_i$ constitute the $i$-th level of this decomposition. We also denote the level of a node $v \in V'$ by $\ell(v)$. Thus, we have $\ell(v) = i$ whenever $v \in V'_i$, and the set of nodes $V'$ is partitioned into $L$ subsets $V'_1, \ldots, V'_L$.
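
Definition 6.5 leaves some freedom: a node whose degree lies between $d$ and $\alpha d$ may go either way. The following sketch builds one valid decomposition by keeping every node whose degree is at least the threshold, and separately checks the two conditions of the definition. The function names and the static (non-dynamic) construction are assumptions made for illustration; this is not the dynamic algorithm of Section 6.3.

```python
def out_degree(s, T, edges):
    """D_s(T): number of edges (s, t) with t in T."""
    return sum(1 for (a, b) in edges if a == s and b in T)


def in_degree(t, S, edges):
    """D_t(S): number of edges (s, t) with s in S."""
    return sum(1 for (a, b) in edges if b == t and a in S)


def build_decomposition(S1, T1, edges, d_S, d_T, L):
    """Build (S_1..S_L), (T_1..T_L), keeping a node iff its degree is at least the threshold."""
    S, T = [set(S1)], [set(T1)]
    for _ in range(L - 1):
        next_S = {s for s in S[-1] if out_degree(s, T[-1], edges) >= d_S}
        next_T = {t for t in T[-1] if in_degree(t, S[-1], edges) >= d_T}
        S.append(next_S)
        T.append(next_T)
    return S, T


def is_decomposition(S, T, edges, alpha, d_S, d_T):
    """Check the nesting property and conditions 1-2 of Definition 6.5."""
    for i in range(len(S) - 1):
        if not (S[i + 1] <= S[i] and T[i + 1] <= T[i]):
            return False
        for s in S[i]:
            d = out_degree(s, T[i], edges)
            if (d > alpha * d_S and s not in S[i + 1]) or (d < d_S and s in S[i + 1]):
                return False
        for t in T[i]:
            d = in_degree(t, S[i], edges)
            if (d > alpha * d_T and t not in T[i + 1]) or (d < d_T and t in T[i + 1]):
                return False
    return True
```

Since $\alpha \geq 1$, the rule "keep iff the degree is at least $d$" retains every node with degree above $\alpha d$ and drops every node with degree below $d$, so the output of `build_decomposition` always passes `is_decomposition`.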

Theorem 6.6 and Corollary 6.7 will play the main role in the rest of this section. Roughly speaking, they are generalizations of Theorem 2.6 and Corollary 2.7 from Section 2.2, and they state that we can use the $(\alpha, d_{S'}, d_{T'}, L)$-decomposition to obtain a $4\alpha(1 + \varepsilon)^{3/2}$-approximation of the value of the densest subgraph of $G = (V, E)$. All we need to do is to set $L = O(\log n / \varepsilon)$ and try different values of $d_{S'}, d_{T'}$ in powers of $(1 + \varepsilon)$.

Theorem 6 . 6 . Fix any a > 1, ds>,dx> > 0, e € (0,1), and L 4 — 2 • (2 + |dog( 1+e ) n]). Let 
(Si,... ,Sl), (Ti, ... ,Tl) be an (a, ds >, dx>, L)-decomposition of the derived graph G’ = ( V',E') as 
per Definition 6.5. Then the following conditions are satisfied. 

1. If ds'dx 1 > 4 (l-t-e)A 5 /Aj’/, then Vf = 0 (i.e., the topmost level of the decomposition is empty). 

2. If ds' < A s'/a and dx> < A x’/ot, then Vf f 0 (i.e., the topmost level is nonempty). 

Proof. 

1. For any level k £ {2, ...,L}. Definition 6.5 states that for every node s £ S^, we have 
D s (Tk- 1 ) > dg’■ We thus get a lower bound on the number edges going from S& to 1 . 

\E’(S k ,T k _i)\>\S k \-d s , (50) 
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Recall that by Lemma 6.4, we have p(G) < 2 • \JXg/Xx'■ This implies that pf{Sk,Tk- 1 ) = 
\E'(S k ,T k _ 1 )\/y/\S k \\T k _ 1 \ < p(G) < 2 • y/Xs'Xr' < Vd S 'd T '/{l + e). Hence, we get: 

I E'(g t .T t -i)i<y / i^ l ' |rfc - 1 ll + ^' l ' |,fr ' i (51) 

From equations 50 and 51, we get: 

I'S'fcl • d S ’ < |T fc _i| • d T '/(l + e) (52) 

Replacing S by T in the above argument, we can analogously show that: 

\Tk\ • d T ’ < I'S'fc— 1 1 • ds'/{l + e) (53) 

Now, we consider two possible cases. 

• If |Sfc_i| • ds' > |Tfc_i| • dr ', then equation 52 implies that \Sk\ < |S’/ c _i|/(l + e). 

• Else if |5fc_i| • ds> < |Tfc_i| • dr 1 , then equation 53 implies that |T*.| < |Tfc_i|/(l + e). 

In other words, when going from level k — 1 to level k, either the size of S or the size of T 
reduces by a factor of (1 + e). Since |Si| = |Ti| = n and L = 2 • (2 + |~log( 1+e ) n |), after L 

levels both the sets S and T are empty. We thus have Sl = Ti = 0. 

2. Consider the densest subgraph of the input graph $G = (V, E)$, given by the pair $(S^*, T^*)$
where $S^* \subseteq S'$ and $T^* \subseteq T'$. Lemma 6.4 states that each node $s \in S^*$ (resp. $t \in T^*$) has
out-degree (resp. in-degree) at least $\lambda_{S'}$ (resp. $\lambda_{T'}$) with respect to the nodes in $T^*$ (resp.
$S^*$). Since $\lambda_{S'} > \alpha \cdot d_{S'}$ and $\lambda_{T'} > \alpha \cdot d_{T'}$, Definition 6.5 implies the following guarantee:

• If $S^* \subseteq S_i$, $T^* \subseteq T_i$ for some $i \in \{1, \ldots, L-1\}$, then we also have $S^* \subseteq S_{i+1}$, $T^* \subseteq T_{i+1}$.

Since $S_1 = S'$, $T_1 = T'$, we have $S^* \subseteq S_1$ and $T^* \subseteq T_1$. Hence, applying the above guarantee
in an inductive fashion, we conclude that $S^* \subseteq S_L$ and $T^* \subseteq T_L$. Thus, $V'_L = S_L \cup T_L \neq \emptyset$.

□ 
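The counting step at the end of part 1 can be checked numerically. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and it only illustrates the pigeonhole step, namely that among the $L-1$ level transitions at least $L/2$ shrink the same side by a factor of $(1+\epsilon)$, which pushes that side's size below one.

```python
import math


def num_levels(n: int, eps: float) -> int:
    """L = 2 * (2 + ceil(log_{1+eps} n)), the value fixed in Theorem 6.6."""
    return 2 * (2 + math.ceil(math.log(n) / math.log(1.0 + eps)))


def size_bound_after_shrinks(n: int, eps: float) -> float:
    """Upper bound on the side that shrinks in at least L/2 of the L-1 transitions.

    Each such transition divides the size by more than (1 + eps), and the
    side starts with n nodes, so its size is below n / (1 + eps)^(L/2).
    """
    L = num_levels(n, eps)
    return n / (1.0 + eps) ** (L // 2)


if __name__ == "__main__":
    for n in (2, 10, 1000):
        for eps in (0.1, 0.5):
            print(n, eps, num_levels(n, eps), size_bound_after_shrinks(n, eps))
```

A bound below one means that side contains no node, which is the emptiness claim used in the proof.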

Corollary 6.7. As in Theorem 6.6, fix any $\alpha > 1$, $\epsilon \in (0,1)$ and $L = 2 \cdot (2 + \lceil \log_{(1+\epsilon)} n \rceil)$.
Define $\lambda^* = 1 - \sqrt{1 - 1/n}$. Discretize the range $[\lambda^*/(\alpha(1+\epsilon)), n^2]$ in powers of $(1+\epsilon)$, by letting
$q_k = (1+\epsilon)^{k-1} \lambda^*/\alpha$ for every integer $k \ge 0$. Let $K$ be the smallest integer $k$ for which
$q_k \ge n^2$. For all $d_{S'}, d_{T'} \in \{q_0, \ldots, q_K\}$, construct an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition of $G'$ as per
Definition 6.5, and let $V'_L(d_{S'}, d_{T'})$ denote the set of nodes at level $L$ of such a decomposition. Let
$P = \{(d_{S'}, d_{T'}) : d_{S'}, d_{T'} \in \{q_0, \ldots, q_K\}, V'_L(d_{S'}, d_{T'}) \neq \emptyset\}$ denote the set of those $(d_{S'}, d_{T'})$ pairs for
which the topmost level of the decomposition is non-empty. If $P = \emptyset$, then define $\gamma = 0$. Else define
$\gamma = \max_{(d_{S'}, d_{T'}) \in P} \{d_{S'} \cdot d_{T'}\}$. Then we have:

$$2 \cdot \sqrt{1+\epsilon} \cdot \rho(G) \ge \sqrt{\gamma} \ge \frac{\rho(G)}{2\alpha(1+\epsilon)}$$

Proof. If the derived graph is empty, i.e., if $E' = \emptyset$, then we have $E = \emptyset$, $P = \emptyset$, $\gamma = 0$, and
$\rho(G) = 0$. Thus, in this case the corollary is trivially true.

For the remainder of the proof, we assume that $E' \neq \emptyset$. Here, it is easy to see that $P \neq \emptyset$:
simply consider the $(\alpha, d_{S'}, d_{T'}, L)$-decomposition with $d_{S'} = d_{T'} = q_0 < 1$. In this decomposition,
every node in $G'$ with nonzero degree will be promoted to the topmost level $V'_L(d_{S'}, d_{T'})$.

Accordingly, for the rest of the proof, we fix a $(d_{S'}, d_{T'})$-pair in $P$ for which $d_{S'} \cdot d_{T'} = \gamma$. Let
this pair be identified as $(d^*_{S'}, d^*_{T'})$. Since $(d^*_{S'}, d^*_{T'}) \in P$, we infer that $V'_L(d^*_{S'}, d^*_{T'}) \neq \emptyset$. Hence, by
the first part of Theorem 6.6, we must have:

$$\gamma = d^*_{S'} \cdot d^*_{T'} \le 4(1+\epsilon) \cdot \lambda_{S'} \cdot \lambda_{T'} \qquad (54)$$

Next, define $\hat{d}_{S'}$ (resp. $\hat{d}_{T'}$) to be the maximum value of $q_k$ which is less than the
threshold $\lambda_{S'}/\alpha$ (resp. $\lambda_{T'}/\alpha$). Note that since $E' \neq \emptyset$, we have $\lambda^* \le \lambda_{S'}, \lambda_{T'} \le n^2$. Thus, we are
guaranteed the existence of such a pair $(\hat{d}_{S'}, \hat{d}_{T'})$.

$$\hat{d}_{S'} = \max_{k \in \{0, \ldots, K\} \,:\, q_k < \lambda_{S'}/\alpha} \{q_k\} \quad \text{and} \quad \hat{d}_{T'} = \max_{k \in \{0, \ldots, K\} \,:\, q_k < \lambda_{T'}/\alpha} \{q_k\} \qquad (55)$$

Next, note that $q_0 < \lambda^*/\alpha$ and $q_K \ge n^2$. Since the consecutive $q_k$ values are within a factor of
$(1+\epsilon)$ from each other, equation 55 gives us:

$$\hat{d}_{S'} \cdot \hat{d}_{T'} \ge \frac{\lambda_{S'} \cdot \lambda_{T'}}{\alpha^2 \cdot (1+\epsilon)^2} \qquad (56)$$

Since $\hat{d}_{S'} < \lambda_{S'}/\alpha$ and $\hat{d}_{T'} < \lambda_{T'}/\alpha$, Theorem 6.6 (part 2) implies that $V'_L(\hat{d}_{S'}, \hat{d}_{T'}) \neq \emptyset$. Thus, we
have $(\hat{d}_{S'}, \hat{d}_{T'}) \in P$. Since $(d^*_{S'}, d^*_{T'}) \in P$ maximizes the product of its two components, we get:

$$\gamma = d^*_{S'} \cdot d^*_{T'} \ge \hat{d}_{S'} \cdot \hat{d}_{T'} \qquad (57)$$

From equations 54, 56 and 57, we infer that:

$$\frac{\lambda_{S'} \cdot \lambda_{T'}}{\alpha^2 \cdot (1+\epsilon)^2} \le \gamma \le 4(1+\epsilon) \cdot \lambda_{S'} \cdot \lambda_{T'} \qquad (58)$$

By the second part of Lemma 6.4, we have: $\rho(G)/2 \le \sqrt{\lambda_{S'} \lambda_{T'}} \le \rho(G)$. Combining this observation
with equation 58, we get:

$$\frac{\rho(G)}{2\alpha(1+\epsilon)} \le \sqrt{\gamma} \le 2 \cdot \sqrt{1+\epsilon} \cdot \rho(G) \qquad (59)$$

This concludes the proof of the corollary. □ 
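The discretization used in Corollary 6.7 can be sketched as follows. This is an illustrative sketch only: the function names `lambda_star`, `threshold_grid` and `best_product` are hypothetical, and the set of pairs with a nonempty topmost level is taken as an input rather than computed from actual decompositions.

```python
import math
from typing import Iterable, List, Tuple


def lambda_star(n: int) -> float:
    """lambda* = 1 - sqrt(1 - 1/n), the smallest threshold of interest."""
    return 1.0 - math.sqrt(1.0 - 1.0 / n)


def threshold_grid(n: int, alpha: float, eps: float) -> List[float]:
    """Return [q_0, ..., q_K] with q_k = (1+eps)^(k-1) * lambda*/alpha.

    K is the smallest index k for which q_k >= n^2.
    """
    base = lambda_star(n) / alpha
    grid = []
    k = 0
    while True:
        q = (1.0 + eps) ** (k - 1) * base
        grid.append(q)
        if q >= n * n:
            return grid
        k += 1


def best_product(nonempty_pairs: Iterable[Tuple[float, float]]) -> float:
    """gamma: the largest product d_S * d_T over the pairs in P, or 0 if P is empty."""
    return max((ds * dt for ds, dt in nonempty_pairs), default=0.0)


if __name__ == "__main__":
    grid = threshold_grid(n=100, alpha=2.3, eps=0.1)
    print(len(grid), grid[0], grid[-1])
```

The grid has $K + 1$ entries, so there are $(K+1)^2$ candidate pairs $(d_{S'}, d_{T'})$, one decomposition per pair.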


6.3 The Algorithm for Maintaining an $(\alpha, d_{S'}, d_{T'}, L)$-Decomposition

Throughout this section, we fix the values of $d_{S'}, d_{T'}, L$. Furthermore, we fix an $\alpha > 2 + \epsilon$.
We describe an algorithm for maintaining an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition of the derived graph
$G' = (V', E')$ in a dynamic setting (see Definition 6.3). We assume that the input graph $G = (V, E)$
is empty in the beginning, and hence, at that instant we also have $E' = \emptyset$. Subsequently, at each
time-step, a directed edge $(u, v)$ is inserted into (resp. deleted from) the graph $G = (V, E)$, and
accordingly, the edge $(s_u, t_v)$ is inserted into (resp. deleted from) the derived graph $G' = (V', E')$.
Our main result is stated below.

Theorem 6.8. For every polynomially bounded $\alpha \ge 2 + 3\epsilon$, we can deterministically maintain an
$(\alpha, d_{S'}, d_{T'}, L)$-decomposition of the derived graph $G' = (V', E')$. Starting from an empty graph,
we can handle a sequence of $t$ update operations (edge insertions/deletions) in total time $O(tL/\epsilon)$.
Thus, we get an amortized update time of $O(L/\epsilon)$. The space complexity of the data structure at
a given time-step is $O(n + m)$, where $m = |E'|$ denotes the number of edges in the derived graph
at that time-step, and $n = |V'|$ denotes the number of nodes in the derived graph (which does not
change over time). Note that $|V'| = 2|V|$ and $|E'| = |E|$, where $G = (V, E)$ is the input graph.

The proof of Theorem 6.8 is very similar to the proof of Theorem 4.2 from Section 4.1. Nevertheless,
for the sake of completeness, we highlight the main parts of the algorithm and its analysis.

Data Structures. Recall the concept of the level of a node from Definition 6.5. We now separately
describe the data structures associated with the nodes in $S'$ and $T'$.

• Every node $s \in S'$ maintains $L$ lists $\mathrm{Friends}_i[s]$, for $i \in \{1, \ldots, L\}$. For $i < \ell(s)$, the list
$\mathrm{Friends}_i[s]$ consists of the neighbors of $s$ that are at level $i$: these are the nodes belonging to
the set $N_s(V'_i \cap T_i)$. For $i = \ell(s)$, the list $\mathrm{Friends}_i[s]$ consists of the neighbors of $s$ that
are at level $i$ or above: these are the nodes belonging to the set $N_s(T_i)$. For $i > \ell(s)$, the
list $\mathrm{Friends}_i[s]$ is empty. Each list is stored as a doubly linked list together with its size,
$\mathrm{Count}_i[s]$. Using appropriate pointers, we can insert or delete a given node to or from a
concerned list in constant time. The counter $\mathrm{Level}[s]$ keeps track of the level of the node $s$.

• Analogously, every node $t \in T'$ maintains $L$ lists $\mathrm{Friends}_i[t]$, for $i \in \{1, \ldots, L\}$. For $i < \ell(t)$,
the list $\mathrm{Friends}_i[t]$ consists of the neighbors of $t$ that are at level $i$: these are the nodes belonging
to the set $N_t(V'_i \cap S_i)$. For $i = \ell(t)$, the list $\mathrm{Friends}_i[t]$ consists of the neighbors of $t$ that
are at level $i$ or above: these are the nodes belonging to the set $N_t(S_i)$. For $i > \ell(t)$, the
list $\mathrm{Friends}_i[t]$ is empty. Each list is stored as a doubly linked list together with its size,
$\mathrm{Count}_i[t]$. Using appropriate pointers, we can insert or delete a given node to or from a
concerned list in constant time. The counter $\mathrm{Level}[t]$ keeps track of the level of the node $t$.
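The bucketing rule behind the Friends lists can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's data structure: the class name `NodeRecord` and its methods are hypothetical, and Python sets stand in for the doubly linked lists with explicit pointers (sets also give expected constant-time insertion and deletion).

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Set


@dataclass
class NodeRecord:
    """State kept for one node x of S' or T' (hypothetical naming)."""

    num_levels: int                      # L
    level: int = 1                       # Level[x]
    friends: Dict[int, Set[Hashable]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # One (initially empty) list Friends_i[x] per level i in {1, ..., L}.
        self.friends = {i: set() for i in range(1, self.num_levels + 1)}

    def bucket(self, neighbor_level: int) -> int:
        """Index of the list that holds a neighbor sitting at neighbor_level.

        Neighbors below level(x) are filed under their exact level; all
        neighbors at level(x) or above share the list with index level(x).
        """
        return min(neighbor_level, self.level)

    def add_neighbor(self, v: Hashable, neighbor_level: int) -> None:
        self.friends[self.bucket(neighbor_level)].add(v)

    def remove_neighbor(self, v: Hashable, neighbor_level: int) -> None:
        self.friends[self.bucket(neighbor_level)].discard(v)

    def count(self, i: int) -> int:
        """Count_i[x], the size of Friends_i[x]."""
        return len(self.friends[i])
```

With this layout the degree of a node towards the other side at its own level is `count(level)`, and the degree one level lower is `count(level - 1) + count(level)`, which are the two quantities the dirty-node test needs.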

The Algorithm. If a node violates one of the conditions of an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition (see
Definition 6.5), then we call the node “dirty”, else the node is called “clean”. Specifically, consider
two possible cases depending on the type of the node.

• A node $s \in S'$ at level $\ell(s) = i$ is dirty iff either (a) $i < L$ and $D_s(T_i) > \alpha \cdot d_{S'}$, or (b) $i > 1$
and $D_s(T_{i-1}) < d_{S'}$.

• A node $t \in T'$ at level $\ell(t) = i$ is dirty iff either (a) $i < L$ and $D_t(S_i) > \alpha \cdot d_{T'}$, or (b) $i > 1$
and $D_t(S_{i-1}) < d_{T'}$.

Initially, the derived graph $G' = (V', E')$ is empty, every node is at level 1, and every node is
clean. When an edge $(s, t)$, $s \in S'$, $t \in T'$, is inserted/deleted in the derived graph $G'$, we first
update the Friends lists of $s$ and $t$ by adding or removing that edge in constant time. Next we
check whether $s$ or $t$ becomes dirty due to this edge insertion/deletion. If yes, then we run the
RECOVER-DIRECTED() procedure described in Figure 5. Note that a single iteration of the
While loop (Steps 01-15) may change the status of some more nodes from clean to dirty (or vice
versa). If and when the procedure terminates, however, every node is clean by definition.
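The dirty-node test and the repair loop of Figure 5 can be sketched as follows. This is a minimal sketch under loud assumptions: the class `Decomposition` and its method names are hypothetical, and degrees are recomputed by scanning adjacency sets instead of being read from the Friends lists, so the sketch reproduces the levels but not the running time claimed in Theorem 6.8. Node names in $S'$ and $T'$ are assumed to be distinct.

```python
from typing import Dict, Hashable, Iterable, Set

Node = Hashable


class Decomposition:
    """Naive (alpha, d_S, d_T, L)-decomposition of a bipartite graph."""

    def __init__(self, S: Iterable[Node], T: Iterable[Node],
                 alpha: float, d_s: float, d_t: float, L: int) -> None:
        self.alpha, self.L = alpha, L
        self.d = {"S": d_s, "T": d_t}
        self.side: Dict[Node, str] = {**{s: "S" for s in S}, **{t: "T" for t in T}}
        self.level: Dict[Node, int] = {x: 1 for x in self.side}
        self.adj: Dict[Node, Set[Node]] = {x: set() for x in self.side}

    def degree(self, y: Node, i: int) -> int:
        """Number of neighbors of y whose level is at least i."""
        return sum(1 for z in self.adj[y] if self.level[z] >= i)

    def move(self, y: Node) -> int:
        """+1 if y must move up, -1 if it must move down, 0 if y is clean."""
        i, d = self.level[y], self.d[self.side[y]]
        if i < self.L and self.degree(y, i) > self.alpha * d:
            return +1
        if i > 1 and self.degree(y, i - 1) < d:
            return -1
        return 0

    def recover(self) -> None:
        """Move dirty nodes by one level at a time until every node is clean."""
        dirty = True
        while dirty:
            dirty = False
            for y in self.level:
                step = self.move(y)
                if step:
                    self.level[y] += step
                    dirty = True

    def insert(self, s: Node, t: Node) -> None:
        self.adj[s].add(t)
        self.adj[t].add(s)
        self.recover()

    def delete(self, s: Node, t: Node) -> None:
        self.adj[s].discard(t)
        self.adj[t].discard(s)
        self.recover()
```

For example, inserting all edges of a complete bipartite graph with four nodes per side, using $\alpha = 3$, $d_{S'} = d_{T'} = 1$ and $L = 4$, drives every node to level 4, and deleting all edges again returns every node to level 1.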

Space complexity. Since each edge in $G' = (V', E')$ appears in two linked lists (corresponding
to each of its endpoints), the space complexity of the data structure is $O(n + m)$.

Analysis of the Update Time. Handling each edge insertion/deletion takes constant time plus
the time for the RECOVER-DIRECTED() procedure. We show below that the total time spent in
procedure RECOVER-DIRECTED() during $t$ update operations is $O(tL/\epsilon)$.

01. While there exists a dirty node $y \in V' = S' \cup T'$:
02.     If $y \in S'$, Then
03.         If $D_y(T_{\ell(y)}) > \alpha \cdot d_{S'}$ and $\ell(y) < L$, Then
04.             Increment the level of $y$ by setting $\ell(y) \leftarrow \ell(y) + 1$.
05.             Update the relevant data structures to reflect this change.
06.         Else if $D_y(T_{\ell(y)-1}) < d_{S'}$ and $\ell(y) > 1$, Then
07.             Decrement the level of $y$ by setting $\ell(y) \leftarrow \ell(y) - 1$.
08.             Update the relevant data structures to reflect this change.
09.     Else if $y \in T'$, Then
10.         If $D_y(S_{\ell(y)}) > \alpha \cdot d_{T'}$ and $\ell(y) < L$, Then
11.             Increment the level of $y$ by setting $\ell(y) \leftarrow \ell(y) + 1$.
12.             Update the relevant data structures to reflect this change.
13.         Else if $D_y(S_{\ell(y)-1}) < d_{T'}$ and $\ell(y) > 1$, Then
14.             Decrement the level of $y$ by setting $\ell(y) \leftarrow \ell(y) - 1$.
15.             Update the relevant data structures to reflect this change.

Figure 5: RECOVER-DIRECTED().

Potential Function. To determine the amortized update time we use a potential function $B$
that depends on the state of the $(\alpha, d_{S'}, d_{T'}, L)$-decomposition. For any two nodes $s \in S'$, $t \in T'$,
let $f(s, t) = 1$ if $\ell(s) = \ell(t)$ and $0$ otherwise. We define $B$, the node potentials $\Phi(x)$ (for each
$x \in V' = S' \cup T'$), and the edge potentials $\Psi(s, t)$ (for each $(s, t) \in E'$) as follows.

$$B = \sum_{x \in S' \cup T'} \Phi(x) + \sum_{(s,t) \in E'} \Psi(s, t) \qquad (60)$$

$$\Phi(s) = \frac{1}{\epsilon} \cdot \sum_{i=1}^{\ell(s)-1} \max(0, \alpha \cdot d_{S'} - D_s(T_i)) \quad \text{for all nodes } s \in S' \qquad (61)$$

$$\Phi(t) = \frac{1}{\epsilon} \cdot \sum_{i=1}^{\ell(t)-1} \max(0, \alpha \cdot d_{T'} - D_t(S_i)) \quad \text{for all nodes } t \in T' \qquad (62)$$

$$\Psi(s, t) = 2 \cdot (L - \min(\ell(s), \ell(t))) + f(s, t) \quad \text{for all edges } (s, t) \in E' \qquad (63)$$


It is easy to check that all these potentials are nonnegative, and that they are uniquely defined
by the $(\alpha, d_{S'}, d_{T'}, L)$-decomposition under consideration. Now, mimicking the potential function
based analysis from Section 4.1, we can infer the following facts.

• (F1) In the beginning, when the derived graph $G' = (V', E')$ is empty, we have $B = 0$.
Subsequently, the potential $B$ remains always nonnegative.

• (F2) Insertion/deletion of an edge in the derived graph $G' = (V', E')$ increases the potential
$B$ by at most $3L/\epsilon$.

• To analyze the amortized running time of the RECOVER-DIRECTED() procedure, we have
the following claims.

- (F3) Consider a single iteration of the While loop in Figure 5 where a node $s \in S'$ with
$\ell(s) = i$ changes (increments or decrements) its level by one. This takes $O(1 + D_s(T_i))$
time. On the other hand, the net drop in the overall potential $B$ due to the same iteration
of the While loop is $\Omega(1 + D_s(T_i))$, provided $\alpha \ge 2 + 3\epsilon$.

- (F4) Consider a single iteration of the While loop in Figure 5 where a node $t \in T'$ with
$\ell(t) = i$ changes (increments or decrements) its level by one. This takes $O(1 + D_t(S_i))$
time. On the other hand, the net drop in the overall potential $B$ due to the same iteration
of the While loop is $\Omega(1 + D_t(S_i))$, provided $\alpha \ge 2 + 3\epsilon$.

Facts (F1) - (F4) imply that the RECOVER-DIRECTED() procedure takes a total time of $O(tL/\epsilon)$
to handle the first $t$ edge insertions/deletions in the derived graph $G' = (V', E')$. This gives an
amortized update time of $O(L/\epsilon)$ for our algorithm and concludes the proof of Theorem 6.8.
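The way the four facts combine can be spelled out as a short chain of inequalities. The symbols $W$ and $c$ are introduced here only for this illustration and do not appear in the original analysis.

```latex
% W: total time spent inside RECOVER-DIRECTED() over the first t updates (symbol introduced here).
% c: a constant covering the O(.) and Omega(.) terms of (F3) and (F4) (also introduced here).
\begin{align*}
W &\le c \cdot (\text{total decrease of } B \text{ inside RECOVER-DIRECTED()})
     && \text{by (F3) and (F4)} \\
  &\le c \cdot (\text{total increase of } B \text{ caused by the } t \text{ updates})
     && \text{by (F1): } B \text{ starts at } 0 \text{ and never becomes negative} \\
  &\le c \cdot t \cdot \frac{3L}{\epsilon}
     && \text{by (F2)} \\
  &= O(tL/\epsilon).
\end{align*}
```

Dividing by $t$ gives the amortized bound of $O(L/\epsilon)$ per update.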

6.4 Wrapping Up: Proof of Theorem 6.2 

We fix a sufficiently small constant $\epsilon \in (0,1)$ and set $\alpha = 2 + 3\epsilon$, $L = 2 \cdot (2 + \lceil \log_{(1+\epsilon)} n \rceil)$, and
$\lambda^* = 1 - \sqrt{1 - 1/n}$. Next, as in Corollary 6.7, we discretize the range $[\lambda^*/(\alpha(1+\epsilon)), n^2]$ by setting
$q_k = (1+\epsilon)^{k-1} \cdot \lambda^*/\alpha$ for every integer $k \ge 0$. We then define $K$ to be the smallest integer $k$ for which
$q_k \ge n^2$. Next, we maintain an $(\alpha, d_{S'}, d_{T'}, L)$-decomposition of the derived graph $G' = (V', E')$ for
every $d_{S'}, d_{T'} \in \{q_0, \ldots, q_K\}$. By Theorem 6.8, maintaining each of these decompositions requires
$O(m + n)$ space and $O(L/\epsilon)$ amortized update time. Hence, the total space requirement of our
scheme is $O(K^2(m + n)) = \tilde{O}(m + n)$ and the total amortized update time is $O(K^2 L/\epsilon) = \tilde{O}(1)$.
We also maintain the value of $\gamma$ as defined in Corollary 6.7. Since there are $O(K^2)$ decompositions,
maintaining the value of $\gamma$ also requires $O(K^2) = \tilde{O}(1)$ update time. By Corollary 6.7, the quantity
$\sqrt{\gamma}/(2\sqrt{1+\epsilon})$ gives a $4\alpha \cdot (1+\epsilon)^{3/2} = 8 \cdot (1 + O(\epsilon))$-approximation to the value of the densest
subgraph $\rho(G)$. This concludes the proof of Theorem 6.2.
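The final estimate and its approximation factor are simple enough to evaluate directly. The sketch below is illustrative only, and the function names are hypothetical; it assumes that $\gamma$ has already been obtained from the maintained decompositions.

```python
import math


def approximation_factor(eps: float) -> float:
    """4 * alpha * (1 + eps)^(3/2) with alpha = 2 + 3 * eps."""
    alpha = 2.0 + 3.0 * eps
    return 4.0 * alpha * (1.0 + eps) ** 1.5


def density_estimate(gamma: float, eps: float) -> float:
    """sqrt(gamma) / (2 * sqrt(1 + eps)), the reported value for rho(G)."""
    return math.sqrt(gamma) / (2.0 * math.sqrt(1.0 + eps))


if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        print(eps, approximation_factor(eps))
```

As $\epsilon$ tends to zero the factor tends to 8, matching the $8 \cdot (1 + O(\epsilon))$ statement above.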

7 Sublinear-Time Algorithm 

In this section, we focus on sublinear-time algorithms for the approximate densest subgraph problem.
Our main results are summarized in Theorems 7.1 and 7.3.

If we assume that an algorithm has to read all of its input, then no sublinear (in the input
size) time algorithm is possible. However, if we assume that the input is given by an oracle that
gives efficient access to the input, then sublinear-time algorithms might exist. In the following, we
present such an oracle, which allows us to turn our algorithm from Section 3 into a sublinear-time
algorithm. Specifically, we will give an $\tilde{O}(n)$ time algorithm that requires $\tilde{O}(n)$ oracle queries and
space. Afterwards we will also show that with this oracle no further asymptotic improvement is
possible.

Oracle model. We first present the oracle model for the input graph. It is a standard representation
that is, e.g., assumed in the sublinear-time algorithms of [10, 19] and is called the incidence-list
model. In this representation, we allow two types of accesses to the input graph (called oracle
queries): (1) the degree query, which asks for the degree of some node $v$, and (2) the neighbor query,
which asks for the $i$-th neighbor of $v$ (i.e., the $i$-th element in the incidence list corresponding to the
neighbors of $v$). See, e.g., [12, 21, 22, 36, 38, 39] for further surveys.
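The two query types can be sketched as a small wrapper that also counts how many queries an algorithm issues. This is an assumption-laden illustration: the class name `IncidenceListOracle` and its interface are hypothetical, and the graph is stored in memory purely to have something to answer from.

```python
from typing import Dict, Hashable, List, Optional


class IncidenceListOracle:
    """Answers degree and neighbor queries and counts them (hypothetical interface)."""

    def __init__(self, adjacency: Dict[Hashable, List[Hashable]]) -> None:
        self._adj = adjacency
        self.queries = 0

    def degree(self, v: Hashable) -> int:
        """Degree query: the number of neighbors of v."""
        self.queries += 1
        return len(self._adj[v])

    def neighbor(self, v: Hashable, i: int) -> Optional[Hashable]:
        """Neighbor query: the i-th entry (1-indexed) of v's incidence list.

        Returns None when v has fewer than i neighbors.
        """
        self.queries += 1
        nbrs = self._adj[v]
        return nbrs[i - 1] if 1 <= i <= len(nbrs) else None


if __name__ == "__main__":
    oracle = IncidenceListOracle({"a": ["b", "c"], "b": ["a"], "c": ["a"]})
    print(oracle.degree("a"), oracle.neighbor("a", 2), oracle.queries)
```

The query counter is the resource that the upper and lower bounds of this section measure.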

Upper Bound. In Section 3, we showed how to compute a $(2+\epsilon)$-approximate solution to the
densest subgraph problem using only $\tilde{O}(n)$ edges sampled uniformly at random. In the above oracle
model, sampling an edge can be done using one neighbor query. Thus, the algorithm needs only
$\tilde{O}(n)$ queries. After the sampling is completed we can process the collection of sampled edges using
$\tilde{O}(n)$ time and space, as in the proof of Theorem 3.1, simply by computing the $(1+\epsilon, d, \tilde{O}(1))$-decomposition
for $\tilde{O}(1)$ different values of $d$, to get the desired $(2+\epsilon)$-approximate solution. This
leads to the following theorem.

Theorem 7.1. There is a sublinear-time algorithm for computing a $(2+\epsilon)$-approximate solution to
the densest subgraph problem in the incidence-list model. The algorithm makes $\tilde{O}(n)$ oracle queries,
and requires $\tilde{O}(n)$ time and $\tilde{O}(n)$ space.

Lower Bound. We adapt the proof of [6, Lemma 7] to show that for any $\lambda \ge 3/2$, a $\lambda$-approximation
algorithm needs to make $\Omega(n/(\lambda^2 \operatorname{poly} \log(n)))$ oracle queries. Consider the following
communication complexity problem “P1”:

• There are $k \ge 2$ players, denoted by $p_1, \ldots, p_k$, and an $n$-node input graph $G$ consisting of $\ell$
disjoint subgraphs, denoted by $G_1, \ldots, G_\ell$. Each $G_i$ has $k$ nodes, denoted by $\{u_{i,1}, \ldots, u_{i,k}\}$
(thus $n = k\ell$). Further, each subgraph is either a star or a clique. For any node $u_{i,j}$ in $G_i$, if its
degree is more than one then player $p_j$ knows about all edges incident to $u_{i,j}$. In other words,
$p_j$ knows about edges incident to nodes with degree more than one among $u_{1,j}, u_{2,j}, \ldots, u_{\ell,j}$.
The players want to distinguish between the case where there is a clique (thus the densest
subgraph has density at least $(k-1)/2$) and the case where there is no clique (thus the densest subgraph
has density at most 1). Their communication protocol is in the blackboard model, where in
each round a player can write a message on the blackboard, which will be seen by all other
players, and the communication complexity is the number of bits written to the board. Using
a reduction from the multi-party set disjointness problem, the papers [6, 8] showed that this
problem requires $\Omega(\ell/k) = \Omega(n/k^2)$ communication bits.
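The density gap between the two cases can be confirmed by brute force on a single small component. The sketch below is only an illustration with hypothetical helper names; it enumerates all node subsets, so it is meant for small $k$ only.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def clique_edges(nodes: Sequence[int]) -> List[Edge]:
    """All pairs of nodes."""
    return list(combinations(nodes, 2))


def star_edges(nodes: Sequence[int]) -> List[Edge]:
    """The first node is the center; every other node is a leaf."""
    center, *leaves = nodes
    return [(center, v) for v in leaves]


def density(edges: List[Edge], subset: Sequence[int]) -> float:
    """|E(S)| / |S| for the subgraph induced by subset."""
    s = set(subset)
    return sum(1 for u, v in edges if u in s and v in s) / len(s)


def max_density(edges: List[Edge], nodes: Sequence[int]) -> float:
    """Densest-subgraph value by exhaustive search over nonempty subsets."""
    return max(
        density(edges, subset)
        for size in range(1, len(nodes) + 1)
        for subset in combinations(nodes, size)
    )


if __name__ == "__main__":
    k = 6
    nodes = list(range(k))
    print(max_density(clique_edges(nodes), nodes))  # (k - 1) / 2
    print(max_density(star_edges(nodes), nodes))    # (k - 1) / k, below 1
```

For a clique on $k$ nodes the maximum is $(k-1)/2$, while for a star it is $(k-1)/k < 1$, which is the gap a good enough approximation algorithm must detect.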

Lemma 7.2. If there is a sublinear-time algorithm with $q$ oracle queries for the problem P1 defined
above, then the problem P1 can also be solved using $O(q)$ communication bits.

Proof. Let $A$ be such an algorithm. Player $p_1$ simulates $A$ by answering each query of $A$ using $O(1)$
communication bits, as follows. If $A$ makes a degree query on node $u_{i,j}$, player $p_1$ will ask for an
answer from player $p_j$: either $p_j$ knows all edges incident to $u_{i,j}$ (in which case the degree of $u_{i,j}$
is $k - 1$) or the degree of $u_{i,j}$ is one. If $A$ asks for the $t$-th neighbor of node $u_{i,j}$, player $p_1$ asks for this
from player $p_j$. If player $p_j$ does not know the answer, then we know that the degree of $u_{i,j}$ is one
and $G_i$ is a star. In this case, player $p_1$ writes on the blackboard asking for the unique node $u_{i,j'}$ in
$G_i$ whose degree is more than one. Then, the only edge incident to $u_{i,j}$ is $u_{i,j} u_{i,j'}$. This edge can be
used to answer the query. □

Note that any $((k-1)/2 - \epsilon)$-approximation algorithm for the densest subgraph problem solves
problem P1. Thus, the above lemma implies that any $((k-1)/2 - \epsilon)$-approximation algorithm
requires $\Omega(n/k^2)$ queries. By considering any $k \ge 4$, we get the following theorem.

Theorem 7.3. In the incidence-list model, for any $\lambda \ge 3/2$ and any $\epsilon > 0$, any $(\lambda - \epsilon)$-approximation
algorithm for the densest subgraph problem needs to make $\Omega(n/\lambda^2)$ queries.

8 Distributed Streams 

In the distributed streaming model (see, e.g., [11]), there are $k$ sites receiving different sequences
of edge insertions (without any deletion), and these sites must coordinate with the coordinator.
The objective is to minimize the communication between the sites and the coordinator in order
to maintain the densest subgraph. We sample $\tilde{O}(n)$ edges (without replacement) as a sketch by
using the sampling algorithm of Cormode et al. [11]: their algorithm can sample $\tilde{O}(n)$ edges using
$\tilde{O}(k + n)$ bits of communication, whereas the coordinator needs $\tilde{O}(n)$ space and each site needs
$O(1)$ space. The coordinator can then use this sketch to compute a $(2+\epsilon)$-approximate solution.

Theorem 8.1. In the distributed streaming setting with $k$ sites [11], we can compute a $(2+\epsilon)$-approximate
solution to the densest subgraph problem using $\tilde{O}(k + n)$ bits of communication. The
coordinator needs $\tilde{O}(n)$ space and each site needs $O(1)$ space.

9 Open problems 

An obvious question is whether the $(4+\epsilon)$ approximation ratio provided by our algorithm is tight.
In particular, it will be interesting if one can improve the approximation ratio to $(2+\epsilon)$ to match
the case where update time is not a concern. Getting this approximation ratio even with larger
space complexity is still interesting. (Epasto et al. [14] almost achieved this except that they have 
to assume that the deletions happen uniformly at random.) It is equally interesting to show a 
hardness result. Currently, there is only a hardness result for maintaining the optimal solution [23]. 
It will be interesting to show a hardness result for approximation algorithms. Another interesting 
question is whether a similar result to ours can be achieved with polylogarithmic worst-case update 
time. Finally, a more general question is whether one can obtain space- and time-efficient
fully-dynamic algorithms like ours for other fundamental graph problems, e.g., maximum matching and
single-source shortest paths.